Description

Background of the Invention:

1) Field of the Invention

The present invention relates to a novel LDL receptor analog protein having a structure similar to that of LDL
receptors, which are responsible for the homeostasis of intracellular cholesterol and participate extensively in
serum lipid metabolism, a critical factor in triggering the onset of arteriosclerosis. The invention also relates to
the gene coding for the protein.

2) Description of the Related Art 

Abnormality in serum lipid metabolism is one of the most critical risk factors in the onset and progress of
arteriosclerosis. Serum lipids, together with apolipoproteins, are transformed into lipoproteins primarily in the
liver, secreted therefrom, transported by blood, and taken up by a variety of tissue cells.

Uptake of lipoproteins into cells occurs primarily by the mediation of receptors of respective lipoproteins. It is
known that low density lipoproteins (LDL), which are taken into cells by specific membrane receptors, called LDL
receptors, are metabolized within the cells and utilized as cell membrane components or similar substances. Detailed
analysis of familial hypercholesterolemia, which is a genetic disease accompanied by notable hypercholesterolemia due
to abnormality of LDL receptors, has clarified details of the mechanism of homeostasis achieved by LDL receptors with
respect to intracellular cholesterol.

It has been suggested that living bodies have not only LDL receptors but also cell membrane receptors that recognize
other lipoproteins. From analyses of WHHL rabbits, which are model animals lacking LDL receptors, it was found that
receptors which take principally apo-E-containing lipoproteins as ligands (remnant receptors) are present in the
liver. It is also predicted that there may be HDL receptors whose ligands are high density lipoproteins (HDL).
However, to date, details of the structures and functions of these receptors have not yet been elucidated. It has
also been known that foaming of macrophages is deeply involved in the formation of atherosclerosis. Macrophages foam
by taking up modified LDL (not normal LDL) which has undergone oxidation, acetylation, or glycation. Receptors for
modified LDL, called scavenger receptors, have recently been discovered. The scavenger receptors have been identified
to be membrane receptors that have a structure completely different from that of LDL receptors.

Recent research using molecular biological techniques has identified the genes of LRP (LDL receptor-related
protein), gp330, and VLDL receptors. These receptors have been found to have structures very similar to those of LDL
receptors. From analyses of these receptors, it is believed that a plurality of lipoprotein receptors are present in
living bodies, and that they are closely related to lipid metabolism. LDL receptors, studied in detail by Brown and
Goldstein [Brown, M.S. and Goldstein, J.L. (1986) Science 232, 34-47], are known to play an important role in the
homeostasis of lipoprotein metabolism in vivo, recognizing apo-B-100 and apo-E and taking primarily LDL as their
ligands. Also, LRP, which is a macroprotein, has been found to primarily recognize apo-E and to take β-VLDL or
chylomicron remnant as a ligand. Moreover, it has been recently reported that LRP takes an α2-macroglobulin/protease
complex or a plasminogen activator/plasminogen activator inhibitor-1 complex as a ligand, and that LRP is a protein
identical to the α2-macroglobulin receptor. When these findings are taken together, LRP is considered to have a wide
variety of functions in living bodies [Herz, J., Hamann, U., Rogne, S., Myklebost, O., Gausepohl, H. and Stanley,
K.K. (1989) EMBO J. 7(13), 4119-4127; Brown, M.S., Herz, J., Kowal, R.C. and Goldstein, J.L. (1991) Current Opinion
in Lipidology 2, 65-72; Herz, J. (1993) Current Opinion in Lipidology 4, 107-113]. The gp330, which was first
identified as an antigen inducing rat Heymann nephritis, has been reported to have a ligand-binding capacity similar
to that possessed by the LRP/α2-macroglobulin receptor [Raychowdhury, R., Niles, J.L., McCluskey, R.T. and Smith,
J.A. (1989) Science 244, 1163-1165; Pietromonaco, S., Kerjaschki, D., Binder, S., Ullrich, R. and Farquhar, G. (1990)
Proc. Natl. Acad. Sci. U.S.A. 87, 1811-1815]. In addition, recently discovered VLDL receptors, which are found to
take VLDL as a ligand, are considered to have new functions including fatty acid metabolism, because they are
predominantly found in tissues of the heart and muscles though they are rarely found in the liver [Takahashi, S.,
Kawarabayashi, Y., Nakai, T., Sakai, J. and Yamamoto, T. (1992) Proc. Natl. Acad. Sci. USA 89, 9252-9256].

Functions of these newly found receptors as lipoprotein receptors have been gradually elucidated through detailed in
vitro analyses. However, the significance of the respective receptors in living bodies has mostly been left unknown.
In addition, their relations to remnant receptors, HDL receptors, etc., which have conventionally been identified or
suggested by biochemical techniques, remain unknown. Presently, it is considered that these newly found receptors are
products of genes different from those of the latter receptors. Thus, more lipoprotein receptors than originally
supposed are now considered to participate in lipoprotein uptake into cells while interacting with each other to
thereby maintain homeostasis of lipid metabolism in living bodies. However, from structural analyses of the genes of
the aforementioned newly-identified receptors, it is predicted that the genes of these receptors that take
lipoproteins as ligands developed from the same gene from which the LDL receptor gene developed, and thus they are
within the same genetic family. This suggests that lipoprotein receptors that have conventionally been proposed may
have structures similar to those of LDL receptors.

Accordingly, an object of the present invention is to provide the gene of a novel receptor in the LDL receptor
family, as well as a protein coded by the gene.

The present inventors conducted careful studies so as to attain the above object, and found that by using part of
rabbit LDL receptor cDNA as a probe there can be obtained a DNA fragment coding for a peptide having a structure
similar to that of LDL receptors. Moreover, when part of the obtained cDNA is used as a probe, a cDNA fragment having
a sequence similar to that of the cDNA can be obtained from a human tissue cDNA library. The present invention was
accomplished based on these findings.

Summary of the Invention 

The present invention provides DNA having a nucleotide sequence shown by Sequence ID No. 1 or No. 5; an LDL
receptor analog protein having an amino acid sequence coded by the DNA; a recombinant vector comprising the DNA
and a replicable vector; transformant cells which harbor the recombinant vector; and a method for the production of
the LDL receptor analog protein.

Description of Preferred Embodiment

The cDNA of the present invention may be prepared, for example, by the following process.
Briefly, the process includes the following steps. (1) Through the use of rabbit LDL receptor cDNA as a probe,
positive clones are screened out of a rabbit liver cDNA library. (2) Recombinant DNA is prepared using the separated
positive clones, and a cDNA fragment is cut out of the resultant recombinant DNA through a treatment using a
restriction enzyme. The cDNA fragment is integrated into a plasmid vector. (3) Host cells are transformed using the
obtained cDNA recombinant vector to thereby obtain transformant cells of the present invention. The obtained
transformant cells are incubated so as to obtain a recombinant vector containing a DNA fragment of the present
invention. The nucleotide sequence of the DNA fragment of the present invention contained in the resultant
recombinant vector is determined. (4) Expression, in tissue of a living body, of mRNA indicated by the nucleotide
sequence of the cDNA of the present invention is detected by an RNA blot hybridization method. (5) Through use of a
rabbit cDNA fragment as a probe, positive clones are screened out of a human tissue cDNA library, and the nucleotide
sequence of the clones is determined. (6) A recombinant vector for expression is prepared using the cDNA of the
present invention. Through use of the thus-obtained vector, host cells are transformed to thereby obtain the
transformants of the present invention. (7) Ligands that are bound to protein expressed by the obtained transformants
are detected by ligand blotting.
Each of the above-described steps will next be described.

(1) Screening for positive clones from a rabbit liver cDNA library: 

A cDNA library may be prepared by the use of mRNA obtained from rabbit liver, reverse transcriptase, and a suitable
vector, e.g., commercially available λgt10 vector.

A cDNA library thus prepared using λgt10 as a vector is screened for positive clones by a DNA hybridization method
employing a cDNA probe, to thereby separate positive clones [Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) In:
Molecular Cloning: A Laboratory Manual, pp 9.47-9.58, Cold Spring Harbor Laboratory Press].

An exemplary cDNA which may be used as a probe is rabbit LDL receptor cDNA. Positive clones may be detected
by autoradiography employing a DNA probe labelled with a radioisotope (³²P).

(2) Preparation of a cDNA recombinant vector:

Recombinant vector λgt10 phage DNA is extracted from the isolated positive clones and purified. The resultant
purified recombinant vector λgt10 phage DNA is digested with a restriction enzyme EcoRI, to thereby separate a cDNA
fragment from the vector DNA. The obtained cDNA fragment is integrated with a plasmid vector for cloning that has
been similarly digested with EcoRI, thereby obtaining a recombinant plasmid vector. An exemplary plasmid vector
which may be used is pBluescript II.

(3) Recombinant vector, transformation of host cells using the recombinant vector, and preparation of DNA: 

The obtained cDNA recombinant vector is introduced into a variety of host cells that are capable of utilizing the
genetic marker possessed by the recombinant vector, to thereby transform the host cells. Host cells are not
particularly limited, with E. coli being preferred. For example, a variety of variants of the E. coli K12 strain,
e.g., HB-101, may be used. In order to introduce the recombinant vector into host cells, a competent cell method may
be used in combination with a treatment with calcium.

The thus-obtained transformant cells are cultured in a selective medium in accordance with the genetic marker of
the vector. The recombinant vector of the present invention is collected from the cultured cells. The DNA nucleotide
sequence of the cDNA contained in the obtained recombinant vector can be determined through use of a dideoxy
sequence method [Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467].

(4) RNA blot hybridization:

The expression in tissue of mRNA, indicated by the nucleotide sequence of the cDNA of the present invention, is
detected using RNA blot hybridization.

First, mRNA is prepared using rabbit tissue. A commercially available oligo(dT) cellulose column may be used for the
preparation. For mRNA from human tissue, there may be used a commercially available nylon membrane on which tissue
poly(A)+ RNA from a variety of sources is present.

An exemplary probe is the rabbit cDNA obtained in the above-described step (3). mRNA may be detected by
autoradiography employing a DNA probe labelled with a radioisotope (³²P).

(5) Screening of human tissue cDNA library for positive clones, and determination of nucleotide sequence:

An exemplary human tissue cDNA library which may be used is a commercially available human brain cDNA
library.

Screening and nucleotide sequencing of the human brain cDNA library may be performed using a fragment of rabbit
cDNA of the present invention as a probe in a manner similar to that used for the aforementioned rabbit liver cDNA
library.

(6) Preparation of a recombinant vector for expression and transformation of host cells using the recombinant vector for 
expression: 

In order to prepare an LDL receptor analog protein through use of cDNA of the present invention, the obtained
cDNA and a vector for expression are first ligated to each other to thereby create a recombinant vector for
expression. Vectors for expression which may be used for ligation are not particularly limited. For example, pBK-CMV
may be used.

Host cells are transformed using the thus-obtained recombinant vector for expression, to thereby obtain a
transformant cell of the present invention. The obtained transformant cell is cultured so as to obtain cells that are
capable of expressing the protein of the invention. Host cells are not particularly limited. For example, CHO cells
may be used. In order to introduce the recombinant vector for expression into host cells, a calcium phosphate method
may be used.

The thus-prepared transformant cells are incubated in a selective medium in accordance with the genetic marker
of the vector, so as to express the LDL receptor analog protein of the present invention.

(7) Ligand analysis of the protein by ligand blotting: 

After the resultant transformant cells are incubated, the expressed LDL receptor analog protein is solubilized using
a solubilizer, e.g., Triton X-100, to thereby obtain a membrane protein fraction. The fraction is separated using
SDS-PAGE, and transferred onto, for example, a nitrocellulose membrane. Using a radio-labelled (¹²⁵I) lipoprotein as
a probe, the analog protein can be detected by autoradiography. Exemplary lipoproteins which may be used include
β-VLDL and LDL.

Examples: 

The present invention will next be described in detail by way of examples, which should not be construed as limiting
the invention.

Example 1:

Preparation of a rabbit liver cDNA library: 

From tissue of the liver of a male Japanese white rabbit, intact RNA was extracted through a guanidinium
thiocyanate/cesium chloride method. The obtained intact RNA was subjected to an oligo(dT) cellulose column method to
thereby obtain purified poly(A)+ RNA.

cDNA was synthesized in accordance with a method of Gubler and Hoffman [Gubler, U. and Hoffman, B.J. (1983)
Gene 25, 263]. Briefly, cDNA was synthesized employing rabbit liver poly(A)+ RNA (as a template), a random primer,
and Moloney murine leukemia virus reverse transcriptase. The synthesized cDNA was converted into double-stranded DNA
using DNA polymerase I, and then subjected to an EcoRI methylase treatment. By the use of T4 DNA polymerase, the DNA
was blunt-ended. The blunt-ended DNA was ligated to phosphorylated EcoRI linker pd(CCGAATTCGG) using a T4 DNA
ligase, and the resultant ligated product was subjected to an additional digestion with EcoRI. cDNA fragments having
a size not less than 1 kb were selected by agarose gel electrophoresis, and integrated into the EcoRI-digested site
of λgt10 phage DNA using a T4 DNA ligase. The phage DNA was packaged in vitro, to thereby establish a rabbit liver
cDNA library.
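
The linker step above relies on the linker pd(CCGAATTCGG) carrying an EcoRI recognition site. The following is a minimal sketch, not part of the original disclosure, that checks two properties of the quoted linker: that it is self-complementary (so two copies can anneal into a duplex) and that it contains the EcoRI site GAATTC. The function names are hypothetical and chosen only for illustration.

```python
# Minimal sketch (illustrative only): properties of the EcoRI linker quoted in Example 1.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

LINKER = "CCGAATTCGG"    # linker sequence as quoted in the text
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_sites(seq: str, site: str = ECORI_SITE) -> list:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


is_self_complementary = reverse_complement(LINKER) == LINKER
site_positions = find_sites(LINKER)

print(is_self_complementary)  # True: the linker pairs with a second copy of itself
print(site_positions)         # [2]: the site sits in the middle of the linker
```

The same `find_sites` helper could be applied to a cDNA sequence to locate internal EcoRI sites, which are the sites the EcoRI methylase treatment is meant to protect before the linker is cut.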

Example 2: 

Cloning of cDNA of receptors in the rabbit LDL receptor family: 

The cDNA library (1,000,000 plaques) prepared in Example 1 was subjected to screening using a plaque hybridization
method and employing as a probe a segment of the cDNA obtained from the ligand binding region, the functional region,
of the rabbit LDL receptor. Hybridization was performed at 42°C using 5 x SSC, 30% formamide, 1% SDS, 5 x Denhardt's,
and 100 ng/ml salmon sperm DNA (ssDNA), followed by washing with 0.3 x SSC/0.1% SDS at 48°C. As a result, several
positive clones were obtained. These cDNA clones were separated by performing this plaque hybridization method a
plurality of times. Subsequently, a cDNA fragment of each phage was subcloned into a plasmid vector pBluescript II,
and the nucleotide sequence was analyzed using a dideoxy sequence method [Sanger, F., Nicklen, S. and Coulson, A.R.
(1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467]. Based on a putative amino acid sequence, LDL receptors themselves
were excluded, and cDNA clones having a sequence very similar to that of LDL receptors were identified. Using these
clones as cDNA probes, the cDNA library was screened to thereby obtain two overlapping clones. These were employed as
new probes and a similar procedure was performed, so as to obtain 5 cDNA clones. The DNA nucleotide sequence
determined from these cDNA clones is shown as Sequence ID No. 3. The total length of the sequence was 6961 bp. In the
open reading frame of 6639 bp (Sequence ID No. 1) which contained a sequence exhibiting high homology with LDL
receptors, there existed on the 5' side an ATG codon which was presumably a translation initiating site and a
successive highly hydrophobic sequence consisting of about 30 amino acids. Accordingly, the obtained cDNA was
considered to contain the entirety of its length. A putative amino acid sequence is shown as Sequence ID No. 2. The
protein consisted of 2213 amino acids. When the amino acid sequence of the protein was compared with other amino acid
sequence data registered in GenBank, a very high similarity to LDL receptors was found. That is, amino acids
700 - 1,100 in the sequence were very similar to the EGF precursor homology region of LDL receptors, and amino acids
1,100 - 1,640 were also very similar to the ligand binding region of LDL receptors. When the amino acid sequence of
the subject protein was compared with those of the other lipoprotein receptors LRP, gp330, and VLDL receptors,
similarity was not as high as that observed for LDL receptors. On the C-terminal side of the amino acid sequence of
the protein, there was found a highly hydrophobic region which was very similar to the transmembrane region of LDL
receptors.
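
The relationship between the 6639 bp open reading frame and the 2213 amino acids can be checked with simple arithmetic, and the start of Sequence ID No. 1 can be translated to confirm that it begins with the residues listed at the start of Sequence ID No. 2. The sketch below is illustrative only: it uses the standard genetic code, takes the first 48 nucleotides of Sequence ID No. 1 as printed in the sequence listing, and uses a hypothetical helper name `translate`.

```python
from itertools import product

# Standard genetic code, enumerated in TCAG order for the first, second and third base.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# One-letter to three-letter codes, to match the notation of Sequence ID No. 2.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) in frame 1; trailing partial codons are ignored."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


ORF_LENGTH_BP = 6639  # length of Sequence ID No. 1 as stated in the text
N_TERMINAL_DNA = "ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTC"  # first 48 nt

residue_count, remainder = divmod(ORF_LENGTH_BP, 3)
peptide = translate(N_TERMINAL_DNA)
peptide_three_letter = " ".join(THREE_LETTER[aa] for aa in peptide)

print(residue_count, remainder)  # 2213 0
print(peptide)                   # MATRSSRRESRLPFLF
print(peptide_three_letter)
```

The three-letter output reproduces the first line of Sequence ID No. 2 (Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe), and the zero remainder shows that the 6639 bp sequence as listed is an exact multiple of three, with no stop codon counted in the length.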

Example 3: 

From liver tissue and brain tissue of a male Japanese white rabbit, intact RNA was extracted through a guanidinium
thiocyanate/cesium chloride method. The obtained intact RNA was subjected to an oligo(dT) cellulose column method
to thereby obtain purified poly(A)+ RNA. The poly(A)+ RNA specimens (10 ng each) were modified via a glyoxal method,
electrophoresed on 1% agarose gel, and transferred onto a nylon membrane.

For human tissue mRNA, commercially available nylon membranes blotted with human tissue poly(A)+ RNA from
various sources were used.

Using as a probe part of a ³²P-labelled rabbit cDNA of the present invention, hybridization was performed at 42°C
using 50% (rabbit) or 40% (human) formamide, 0.1% SDS, 50 mM phosphate buffer, 5 x Denhardt's, 5 x SSC, and 200
ng/ml of ssDNA, followed by washing with 0.1 x SSC and 0.1% SDS at 50°C. Autoradiography was performed at -70°C
for 2 days in the presence of an intensifying screen. As a result, in both rabbit liver tissue and brain tissue, mRNA
of about 7 kb was detected as well as mRNA of about 15 kb which was considered to result from alternative splicing or
polyadenylation. The size of the mRNA of about 7 kb coincided with that of the rabbit cDNA of the present invention.
Also, in human liver tissue and brain tissue, it was confirmed that mRNA having the same size was expressed.
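
To make the difference between the hybridization buffer (5 x SSC) and the wash buffer (0.1 x SSC) concrete, the sketch below converts an SSC dilution into an approximate sodium ion concentration. It is illustrative only and rests on an assumption that is not stated in the text: that 1 x SSC has the commonly used composition of 150 mM NaCl and 15 mM trisodium citrate. Sodium contributed by SDS or phosphate buffer is ignored.

```python
# Minimal sketch (illustrative only): sodium ion concentration of SSC dilutions.
# ASSUMPTION (not from the text): 1 x SSC = 150 mM NaCl + 15 mM trisodium citrate.
NACL_MM_AT_1X = 150.0
CITRATE_MM_AT_1X = 15.0
SODIUM_PER_CITRATE = 3  # trisodium citrate contributes three Na+ per molecule


def sodium_mm(ssc_fold: float) -> float:
    """Approximate Na+ concentration (mM) of an SSC solution at the given fold strength."""
    sodium_at_1x = NACL_MM_AT_1X + SODIUM_PER_CITRATE * CITRATE_MM_AT_1X
    return ssc_fold * sodium_at_1x


hybridization_na = sodium_mm(5.0)  # 5 x SSC, as used for hybridization
wash_na = sodium_mm(0.1)           # 0.1 x SSC, as used for washing

print(f"hybridization: {hybridization_na:.1f} mM Na+")
print(f"wash:          {wash_na:.1f} mM Na+")
print(f"fold reduction: {hybridization_na / wash_na:.0f}")
```

Under this assumption the wash step contains roughly fifty times less sodium than the hybridization step, which is why the low-salt wash at 50°C removes weakly bound probe.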
Example 4:

Screening of human brain cDNA library for positive clones and determination of the nucleotide sequence of cDNA
fragments

The human brain cDNA library used in this Example was a commercially obtained cDNA library which was constructed
using λgt10 as a vector. Using partial cDNA of the present invention as a probe, screening of the cDNA library
(300,000 plaques) was performed using a plaque hybridization method. Procedures of screening, cloning, and
sequencing were as described in Example 2 of the present invention.

As a result of screening of the human brain cDNA library, positive clones containing a DNA fragment of about 3 kb
were obtained. Analysis of the nucleotide sequence of part of the cDNA fragment revealed that the fragment was highly
homologous to the cDNA of the present invention (Sequence ID No. 4).

Example 5: 

Cloning of cDNA of receptors in the human LDL receptor family:

A human brain cDNA library was subjected to screening using fragments of the cDNA of the present invention and
fragments of the cDNA obtained in Example 4 as probes. Procedures of screening, cloning, and sequencing were as
described in Example 2 of the present invention.

Through screening of the human brain cDNA library, two positive clones containing cDNA fragments of about 6 kb
and about 3 kb were obtained. When their nucleotide sequences were analyzed, they were identified to be a cDNA clone
containing the cDNA nucleotide sequence obtained in Example 4 and a cDNA clone that overlapped therewith. Using
part of these cDNAs as probes, procedures similar to those described above were performed, to thereby obtain
another cDNA clone. The DNA nucleotide sequence indicated by these cDNA clones is shown as Sequence ID No. 7. The
total length of the sequence was 6,843 bp. There was an open reading frame having a size of 6,642 bp (Sequence ID
No. 5). A putative amino acid sequence is shown as Sequence ID No. 6. The protein consisted of 2,214 amino acids.
Comparison of the amino acid sequence with that of the rabbit protein shown by Sequence ID No. 2 revealed high
homology of not less than 94%.
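
The text does not state how the homology figure was calculated. As one possible reading, the sketch below computes percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. It is a hypothetical illustration: the function name and the short toy strings are not taken from Sequence ID No. 2 or No. 6, and the same arithmetic check used for the rabbit open reading frame is repeated for the human one.

```python
# Minimal sketch (illustrative only): percent identity between two pre-aligned sequences.
# The toy sequences below are hypothetical and are NOT the sequences of the invention.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


# Human open reading frame: 6,642 bp should correspond to 2,214 codons.
human_residues, human_remainder = divmod(6642, 3)

toy_identity = percent_identity("MATRSS", "MATKSS")  # 5 of 6 columns match

print(human_residues, human_remainder)
print(f"{toy_identity:.1f}%")
```

With real data, the two inputs would be the aligned amino acid sequences of the rabbit and human proteins; a value of 94% or more would then correspond to the homology stated above, under the assumption that identity is what was measured.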

Example 6:

Creation of cells that express receptors in the rabbit LDL receptor family:

The cDNA as shown by Sequence ID No. 3 was ligated to phosphorylated EcoRI linker pd(CCGAATTCGG) by the
use of a T4 DNA ligase, and the resultant ligated product was digested with EcoRI. Separately, a vector for
expression, pBK-CMV, was digested with EcoRI. The aforementioned DNA was ligated to the EcoRI-digested site of the
vector using a T4 DNA ligase.

Using the resultant recombinant expression vector in a calcium phosphate method [Chen, C. and H. Okayama
(1987) Mol. Cell. Biol. 7, 2745-2752], host cells (CHO-ldlA7) were transformed. The resultant transformants were
incubated in a Ham's F-12 selective medium supplemented with 500 µg/ml of G418, and viable cells were separated as
LDL receptor analog protein-expressing cells. The cells were incubated further in the aforementioned medium.

Example 7: 

Ligand analysis of the LDL receptor analog protein by ligand blotting:

The obtained LDL receptor analog protein-expressing cells and control cells were suspended in a buffer solution
containing 200 mM Tris-maleic acid (pH 6.5), 2 mM calcium chloride, 0.5 mM PMSF, 2.5 µM leupeptin, and 1% Triton
X-100, to thereby solubilize the membrane protein. Solubilized membrane protein fractions were obtained through
centrifugation, and electrophoresed by a 4.5-18% gradient SDS-PAGE. Thereafter, the protein was transferred onto a
nitrocellulose membrane.

Incubation was performed in a buffer of 50 mM Tris-HCl (pH 8.0) containing ¹²⁵I-labelled β-VLDL (10 µg/ml), 2 mM
calcium chloride, and 5% bovine serum albumin. Autoradiography was performed at room temperature.

A single band of about 250 kDa was detected in membrane protein fractions prepared using the present
protein-expressing cells. This size coincided well with the molecular weight of 248 kDa calculated from the amino
acid sequence (Sequence ID No. 2) deduced from the cDNA of the present invention. Although a similar band was
detected for control cells, the expression level was much lower as compared with the case of the present
protein-expressing cells.
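
The agreement between the observed band of about 250 kDa and the calculated 248 kDa can be sanity-checked with a rough estimate from the residue count alone. The sketch below is illustrative only and assumes an average residue mass of about 110 Da, a common rule of thumb that does not appear in the text; the calculated value of 248 kDa in the text was presumably derived from the actual amino acid composition.

```python
# Minimal sketch (illustrative only): rough molecular mass from residue count.
# ASSUMPTION (not from the text): an average residue mass of about 110 Da.
AVERAGE_RESIDUE_MASS_DA = 110.0

RESIDUE_COUNT = 2213         # length of Sequence ID No. 2 as stated in the text
CALCULATED_MASS_KDA = 248.0  # value stated in the text


def estimate_mass_kda(residues: int, per_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kDa from the number of residues."""
    return residues * per_residue_da / 1000.0


estimate_kda = estimate_mass_kda(RESIDUE_COUNT)
relative_gap = abs(estimate_kda - CALCULATED_MASS_KDA) / CALCULATED_MASS_KDA

print(f"estimate: {estimate_kda:.1f} kDa")
print(f"difference from stated value: {relative_gap:.1%}")
```

The rough estimate lands within a few percent of the stated 248 kDa, which is consistent with the single band near 250 kDa corresponding to the full-length protein.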

Since the protein coded by the cDNA of the present invention is considered to be a novel LDL receptor family
receptor, it is expected that through analyses of this protein, details of lipoprotein metabolism mediated by the
membrane receptor will be elucidated, and the pathology of abnormal lipid metabolism which triggers the onset and
progress of arteriosclerosis will be clarified.

Sequence ID No. 1 

Length of the Sequence: 6639 

Type: nucleic acid 

Strandedness: double 

Topology: linear 

Molecular type: cDNA to mRNA 

Sequence: 

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTCAC CCTGGTCGCG 60 

CTGCTGCCGC CCGGGGCTCT CTGCGAGGTG TGGACGCGGA CACTGCACGG CGGCCGCGCG 120 

CCCTTACCCC AGGAGCGGGG CTTCCGCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180 

TGGGAGCGCG GGGATGCCAG GGGGGCGAGC CGGGCGGACG AGAAGCCGCT CCGGAGGAGA 240 

CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTCAG CCTCAATGAT 300 

TCCCACAATC AGATGGTGGT GCACTGGGCC GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360 

GCCCGGGACA GCCTGGCGTT GGCCAGGCCC AGGAGCAGTG ATGTGTACGT GTCTTATGAC 420 

TATGGAAAAT CATTCAATAA GATTTCAGAG AAATTGAACT TCGGCGCGGG AAATAACACA 480 

GAGGCTGTGG TGGCCCAGTT CTACCACAGC CCTGCGGACA ACAAACGGTA CATCTTCGCA 540 

GATGCCTACG CCCAGTATCT CTGGATCACG TTTGACTTCT GCAACACCAT CCATGGCTTT 600 

TCCATCCCGT TCCGGGCAGC TGATCTCCTA CTCCACAGTA AGGCCTCCAA CCTTCTCCTG 660 

GGCTTCGACA GGTCTCACCC CAACAAGCAG CTGTGGAAGT CGGATGATTT TGGCCAGACC 720 

TGGATCATGA TTCAAGAACA CGTGAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780 

CCAAACACCA TCTACATCGA ACGGCACGAA CCTTCTGGCT ACTCCACGGT TTTCCGAAGT 840 

ACAGACTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCT TGGAGGAAGT GAGAGACTTT 900 

CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTCCACTG 960 

CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGCGGGC CGCCCAGTTT 1020 

GTTACAAGAC ATCCTATCAA CGAATATTAC ATCGCGGATG CCTCGGAGGA CCAGGTGTTT 1080 

GTGTGTGTCA GTCACAGCAA CAACCGCACC AACCTCTACA TCTCGGAGGC AGAGGGCTTG 1140

AAGTTCTCTC TGTCCCTGGA GAACGTGCTC TACTACACCC CGGGAGGGGC CGGCAGTGAC 1200

ACCTTGGTGA GGTACTTTGC AAATGAACCG TTTGCTGACT TCCATCGTGT GGAAGGGTTG 1260
CAGGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCT 1320
GTCATCACCT TTGACAAAGG GGGCACCTGG GAATTTCTGC AGGCTCCAGC CTTCACGGGG 1380
TATGGAGAGA AAATCAACTG TGAGCTGTCC GAGGGCTGTT CCCTCCACCT GGCCCAGCGC 1440
CTCAGCCAGC TGCTCAACCT CCAGCTCCGG AGGATGCCCA TCCTGTCCAA GGAGTCGGCG 1500
CCTGGCCTCA TCATTGCCAC GGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560
TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAT 1620
ACATGGGGAG ACCATGGCGG CATCATCATG GCCATTGCCC AAGGCATGGA AACCAACGAA 1680
CTGAAGTACA GTACCAACGA AGGGGAGACC TGGAAAGCCT TCACCTTCTC TGAGAAGCCC 1740
GTGTTTGTGT ATGGGCTCCT CACGGAACCC GGCGAGAAGA GCACGGTCTT CACCATCTTT 1800
GGCTCCAACA AGGAGAACGT GCACAGCTGG CTCATCCTCC AGGTCAATGC CACAGACGCC 1860
CTGGGGGTTC CTTGCACAGA GAACGACTAC AAGCTCTGGT CACCATCTGA TGAGCGGGGG 1920
AATGAGTGTT TGCTTGGACA CAAGACTGTT TTCAAACGGA GGACCCCGCA CGCCACATGC 1980
TTTAACGGAG AAGACTTTGA CAGGCCGGTG GTTGTGTCCA ACTGCTCCTG CACCCGGGAG 2040
GACTATGAGT GTGACTTTGG CTTCCGGATG AGTGAAGACT TGGCATTAGA GGTGTGTGTT 2100
CCAGATCCAG GATTTTCTGG AAAGTCCTCC CCTCCAGTGC CTTGTCCCGT GGGCTCTACG 2160
TACAGGCGAT CAAGAGGCTA CCGGAAGATT TCTGGGGACA CCTGTAGTGG AGGAGATGTT 2220
GAGGCACGGC TAGAAGGAGA GCTGGTCCCC TGTCCCCTGG CAGAAGAGAA CGAGTTCATC 2280
CTGTACGCCA CGCGCAAGTC CATCCACCGC TATGACCTGG CTTCCGGAAC CACGGAGCAG 2340
TTGCCCCTCA CTGGGTTGCG GGCAGCAGTG GCCCTGGACT TTGACTATGA GCACAACTGC 2400
CTGTATTGGT CTGACCTGGC CTTGGACGTC ATCCAGCGCC TCTGTTTGAA CGGGAGTACA 2460
GGACAAGAGG TGATCATCAA CTCTGACCTG GAGACGGTAG AAGCTTTGGC TTTTGAACCC 2520
CTCAGCCAAT TACTTTACTG GGTGGACGCA GGCTTTAAAA AGATCGAGGT AGCCAATCCA 2580
GATGGTGACT TCCGACTCAC CGTCGTCAAT TCCTCGGTGC TGGATCGGCC CCGGGCCCTG 2640
GTCCTTGTGC CCCAAGAAGG GATCATGTTC TGGACCGACT GGGGAGACCT GAAGCCTGGG 2700
ATTTATCGGA GCAACATGGA CGGATCTGCC GCCTATCGCC TCGTGTCGGA GGATGTGAAG 2760
TGGCCCAATG GCATTTCCGT GGACGATCAG TGGATCTACT GGACGGATGC CTACCTGGAC 2820
TGCATTGAGC GCATCACGTT CAGCGGCCAG CAGCGCTCCG TCATCCTGGA CAGACTCCCG 2880
CACCCCTATG CCATTGCTGT CTTTAAGAAT GAGATTTACT GGGATGACTG GTCACAGCTC 2940
AGCATATTCC GAGCTTCTAA GTACAGCGGG TCCCAGATGG AGATTCTGGC CAGCCAGCTC 3000
ACGGGGCTGA TGGACATGAA GATCTTCTAC AAGGGGAAGA ACACAGGAAG CAATGCGTGT 3060 
GTACCCAGGC CGTGCAGCCT GCTGTGCCTG CCCAGAGCCA ACAACAGCAA AAGCTGCAGG 3120 
TGTCCAGATG GCGTGGCCAG CAGTGTCCTC CCTTCCGGGG ACCTGATGTG TGACTGCCCT 3180 
AAGGGCTACG AGCTGAAGAA CAACACGTGT GTCAAAGAAG AAGACACCTG TCTGCGCAAC 3240 
CAGTACCGCT GCAGCAACGG GAACTGCATC AACAGCATCT GGTGGTGCGA TTTCGACAAC 3300 
GACTGCGGAG ACATGAGCGA CGAGAAGAAC TGCCCTACCA CCATCTGCGA CCTGGACACC 3360 
CAGTTCCGTT GCCAGGAGTC TGGGACGTGC ATCCCGCTCT CCTACAAATG TGACCTCGAG 3420 
GATGACTGTG GGGACAACAG TGACGAAAGG CACTGTGAAA TGCACCAGTG CCGGAGCGAC 3480 
GAATACAACT GCAGCTCGGG CATGTGCATC CGCTCCTCCT GGGTGTGCGA CGGGGACAAC 3540 
GACTGCAGGG ACTGGTCCGA CGAGGCCAAC TGCACAGCCA TCTATCACAC CTGTGAGGCC 3600 
TCCAACTTCC AGTGCCGCAA CGGGCACTGC ATCCCCCAGC GGTGGGCGTG TGACGGCGAC 3660 
GCCGACTGCC AGGATGGCTC TGATGAGGAT CCAGCCAACT GTGAGAAGAA GTGCAACGGC 3720 
TTCCGCTGCC CGAACGGCAC CTGCATTCCC TCCACCAAGC ACTGTGACGG CCTGCACGAT 3780 
TGCTCGGACG GCTCCGACGA GCAGCACTGC GAGCCCCTGT GTACACGGTT CATGGACTTC 3840 
GTGTGTAAGA ACCGCCAGCA GTGCCTCTTC CACTCCATGG TGTGCGATGG GATCATCCAG 3900 
TGCCGTGACG GCTCCGACGA GGACCCAGCC TTTGCAGGAT GCTCCCGAGA CCCCGAGTTC 3960 
CACAAGGTGT GCGATGAGTT CGGCTTCCAG TGTCAGAACG GCGTGTGCAT CAGCTTGATC 4020 
TGGAAGTGCG ACGGGATGGA TGACTGCGGG GACTACTCCG ACGAGGCCAA CTGTGAAAAC 4080 
CCCACAGAAG CCCCCAACTG CTCCCGCTAC TTCCAGTTCC GGTGTGACAA TGGCCACTGC 4140 
ATCCCCAACA GGTGGAAGTG TGACAGGGAG AATGACTGTG GGGACTGGTC CGACGAGAAG 4200 
GACTGTGGAG ATTCACATGT ACTTCCGTCT ACGACTCCTG CACCCTCCAC GTGTCTGCCC 4260 
AATTACTACC GCTGCGGCGG GGGGGCCTGC GTGATAGACA CGTGGGTTTG TGACGGGTAC 4320 
CGAGATTGCG CAGATGGATC CGACGAGGAA GCCTGCCCCT CGCTCCCCAA TGTCACTGCC 4380 
ACCTCCTCCC CCTCCCAGCC TGGACGATGC GACCGATTTG AGTTTGAGTG CCACCAGCCA 4440 
AAGAAGTGCA TCCCTAACTG GAGACGCTGT GACGGCCATC AGGATTGCCA GGATGGCCAG 4500 
GACGAGGCCA ACTGCCCCAC TCACAGCACC TTGACCTGCA TGAGCTGGGA GTTCAAGTGT 4560 
GAGGATGGCG AGGCCTGCAT CGTGCTGTCA GAACGCTGCG ACGGCTTCCT GGACTGCTCA 4620
GATGAGAGCG ACGAGAAGGC CTGCAGTGAT GAGTTAACTG TATACAAAGT ACAGAATCTT 4680
CAGTGGACAG CTGACTTCTC TGGGAATGTC ACTTTGACCT GGATGCGGCC CAAAAAAATG 4740
CCCTCTGCTG CTTGTGTATA CAACGTGTAC TATAGAGTTG TTGGAGAGAG CATATGGAAG 4800
ACTCTGGAGA CTCACAGCAA TAAGACAAAC ACTGTATTAA AAGTGTTGAA ACCAGATACC 4860
ACCTACCAGG TTAAAGTGCA GGTTCAGTGC CTGAGCAAGG TGCACAACAC CAATGACTTT 4920
GTGACCTTGA GAACTCCAGA GGGATTGCCA GACGCCCCfC AGAACCTCCA GCTGTCGCTC 4980
CACGGGGAAG AGGAAGGTGT GATTGTGGGC CACTGGAGCC CTCCCACCCA CACCCACGGC 5040
CTCATTCGCG AATACATTGT AGAGTATAGC AGGAGTGGTT CCAAGGTGTG GACTTCAGAA 5100
AGGGCTGCTA GTAACTTTAC AGAAATAAAG AACTTGTTGG TCAACACCCT GTACACCGTC 5160
AGAGTGGCTG CGGTGACGAG TCGTGGGATA GGAAACTGGA GCGATTCCAA ATCCATTACC 5220
ACCGTGAAAG GAAAAGCGAT CCCGCCACCA AATATCCACA TTGACAACTA CGATGAAAAT 5280
TCCCTGAGTT TTACCCTGAC CGTGGATGGG AACATCAAGG TGAATGGCTA TGTGGTGAAC 5340
CTTTTCTGGG CATTTGACAC CCACAAACAA GAGAAGAAAA CCATGAACTT CCAAGGGAGC 5400
TCAGTGTCCC ACAAAGTTGG CAATCTGACA GCACAGACGG CCTATGAGAT TTCCGCCTGG 5460
GCCAAGACTG ACTTGGGCGA TAGTCCTCTG TCATTTGAGC ATGTCACGAC CAGAGGGGTT 5520
CGCCCACCTG CTCCTAGCCT CAAGGCCAGG GCTATCAATC AGACTGCAGT GGAATGCACC 5580
TGGACAGGCC CCAGGAATGT GGTGTATGGC ATTTTCTATG CCACATCCTT CCTGGACCTC 5640
TACCGCAACC CAAGCAGCCT GACCACGCCG CTGCACAACG CAACCGTGCT CGTCGGTAAG 5700
GATGAGCAGT ATCTGTTTCT GGTCCGGGTG GTGATGCCCT ACCAAGGGCC GTCCTCGGAC 5760
TACGTGGTCG TGAAGATGAT CCCGGACAGC AGGCTTCCTC CCCGGCACCT GCATGCCGTT 5820
CACACCGGCA AGACCTCGGC CGTCATCAAG TGGGAGTCGC CCTACGACTC TCCTGACCAG 5880
GACCTGTTCT ATGCGATCGC AGTTAAAGAT CTGATACGAA AGACGGACCG GAGCTACAAA 5940
GTCAAGTCCC GCAACAGCAC CGTGGAGTAC ACCCTGAGCA AGCTGGAGCC CGGAGGGAAA 6000
TACCACGTCA TTGTGCAGCT GGGGAACATG AGCAAAGATG CCAGTGTGAA GATCACCACC 6060
GTTTCGTTAT CGGCACCCGA TGCCTTAAAA ATCATAACAG AAAATGACCA CGTCCTTCTC 6120
TTCTGGAAAA GTCTAGCTCT AAAGGAAAAG TATTTTAACG AAAGCAGGGG CTACGAGATA 6180
CACATGTTTG ATAGCGCCAT GAATATCACC GCATACCTTG GGAATACTAC TGACAATTTC 6240
TTTAAAATTT CCAACCTGAA GATGGGTCAC AATTACACAT TCACGGTCCA GGCACGATGC 6300
CTTTTGGGCA GCCAGATCTG CGGGGAGCCT GCCGTGCTAC TGTATGATGA GCTGGGGTCT 6360
GGTGGCGATG CGTCGGCGAT GCAGGCTGCC AGGTCTACTG ATGTCGCCGC CGTGGTGGTG 6420 
CCCATCCTGT TTCTGATACT GCTGAGCCTG GGGGTCGGGT TTGCCATCCT GTACACGAAG 6480 
CATCGGAGGC TGCAGAGCAG CTTCACCGCC TTCGCCAACA GCCACTACAG CTCCAGACTC 6540 
GGCTCCGCCA TCTTCTCCTC TGGGGATGAC TTGGGGGAGG ATGATGAAGA TGCTCCTATG 6600 
ATCACTGGAT TTTCGGACGA CGTCCCCATG GTGATAGCC 6639 
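
The running position printed at the end of each nucleotide row can be checked mechanically. The following is a minimal sketch, not part of the original listing: the helper name check_row is hypothetical, and it assumes each row consists of space-separated base blocks followed by the cumulative position of the last base in that row. It is applied here to the final three rows of the listing above.

```python
def check_row(line):
    """Split one listing row into its bases and its trailing position counter.

    Assumes the layout 'BLOCK BLOCK ... BLOCK <position>', where the
    position is the 1-based index of the last base printed on the row.
    """
    *blocks, counter = line.split()
    bases = "".join(blocks)
    # Reject anything that is not an unambiguous DNA base (e.g. OCR debris).
    unexpected = set(bases) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in row: {sorted(unexpected)}")
    return bases, int(counter)


# Final three rows of the nucleotide listing, copied from the text above.
rows = [
    "CATCGGAGGC TGCAGAGCAG CTTCACCGCC TTCGCCAACA GCCACTACAG CTCCAGACTC 6540",
    "GGCTCCGCCA TCTTCTCCTC TGGGGATGAC TTGGGGGAGG ATGATGAAGA TGCTCCTATG 6600",
    "ATCACTGGAT TTTCGGACGA CGTCCCCATG GTGATAGCC 6639",
]

position = 6480  # counter printed on the row that precedes these three
for row in rows:
    bases, end = check_row(row)
    position += len(bases)
    # The accumulated length must agree with the printed counter.
    assert position == end, (position, end)
```

A row whose accumulated length disagrees with its printed counter, or that contains a character other than A, C, G or T, is a likely site of a transcription error.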
Sequence ID No. 2 
Length of the Sequence: 2213 
Type: amino acid 
Topology: linear 
Molecular type: Protein 
Sequence : 

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe 

5 10 15 

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr 

20 25 30 

Arg Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe

35 40 45

Arg Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly

50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val

85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu

100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125

Arg Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser

130 135 140 

Phe Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr
145 150 155 160

Glu Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg

165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp

180 185 190

Phe Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp

195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg

210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp

245 250 255

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser

260 265 270

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu

275 280 285

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp

290 295 300

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu
305 310 315 320

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg

325 330 335

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala
340 345 350

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn

355 360 365 

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu

370 375 380 

Ser Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp 
385 390 395 400 

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg 

405 410 415 

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser

420 425 430

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly

435 440 445

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys

450 455 460

Ile Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg
465 470 475 480

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser

485 490 495

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys

500 505 510

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala

515 520 525

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp

530 535 540

His Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu

580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His

595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro

610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro 

645 650 655 

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val 

660 665 670 

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe 

675 680 685 

Arg Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly 

690 695 700 

Phe Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr 
705 710 715 720

Tyr Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser

725 730 735 

Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro 

740 745 750 

Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile

755 760 765

His Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr

770 775 780

Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys
785 790 795 800

Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu

805 810 815 

Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr

820 825 830

Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val

835 840 845

Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe

850 855 860 

Arg Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu 
865 870 875 880 

Val Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp

885 890 895

Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr

900 905 910

Arg Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp

915 920 925

Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg

930 935 940

Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro
945 950 955 960

His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp

965 970 975

Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln

980 985 990

Met Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile

995 1000 1005

Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro
1010 1015 1020

Cys Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg
1025 1030 1035 1040 

Cys Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met 

1045 1050 1055

Cys Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys 

1060 1065 1070 

Glu Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn

1075 1080 1085

Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp

1090 1095 1100

Met Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr
1105 1110 1115 1120

Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys

1125 1130 1135

Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys

1140 1145 1150

Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met

1155 1160 1165

Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp

1170 1175 1180

Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala
1185 1190 1195 1200

Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala

1205 1210 1215

Cys Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala

1220 1225 1230

Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys
1235 1240 1245

Ile Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly

1250 1255 1260 

Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe
1265 1270 1275 1280

Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp

1285 1290 1295

Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala

1300 1305 1310 

Gly Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly 

1315 1320 1325 

Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp

1330 1335 1340 

Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn 
1345 1350 1355 1360 

Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp

1365 1370 1375

Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp

1380 1385 1390 

Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu 

1395 1400 1405 

Pro Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg 

1410 1415 1420 

Cys Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr
1425 1430 1435 1440 

Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro 

1445 1450 1455 

Asn Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg
1460 1465 1470

Phe Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg

1475 1480 1485 

Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn

1490 1495 1500 

Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys 
1505 1510 1515 1520 

Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe

1525 1530 1535 

Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu 

1540 1545 1550 

Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly

1555 1560 1565

Asn Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala

1570 1575 1580

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys

1585 1590 1595 1600 

Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu 

1605 1610 1615 

Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser

1620 1625 1630 

Lys Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly 

1635 1640 1645 

Leu Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu

1650 1655 1660

Glu Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly
1665 1670 1675 1680

Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val

1685 1690 1695

Trp Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu

1700 1705 1710

Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg 

1715 1720 1725 

Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly

1730 1735 1740

Lys Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn
1745 1750 1755 1760

Ser Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly

1765 1770 1775

Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys

1780 1785 1790

Lys Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn

1795 1800 1805

Leu Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp

1810 1815 1820 

Leu Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val 
1825 1830 1835 1840 

Arg Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala

1845 1850 1855

Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe

1860 1865 1870 

Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr 

1875 1880 1885 

Thr Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr

1890 1895 1900

Leu Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp
1905 1910 1915 1920

Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His

1925 1930 1935 

Leu His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu

1940 1945 1950

Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val

1955 1960 1965

Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg

1970 1975 1980 

Asn Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys 
1985 1990 1995 2000 

Tyr His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val

2005 2010 2015

Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile

2020 2025 2030 

Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys 

2035 2040 2045 

Glu Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp

2050 2055 2060

Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe
2065 2070 2075 2080

Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val

2085 2090 2095

Gln Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val

2100 2105 2110

Leu Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln

2115 2120 2125

Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140

Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys
2145 2150 2155 2160 

His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr

2165 2170 2175

Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly

2180 2185 2190

Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val

2195 2200 2205

Pro Met Val Ile Ala
2210 

Sequence ID No. 3 

Length of the Sequence: 6961 

Type: nucleic acid 

Strandedness : double 

Topology: linear 

Molecular type: cDNA to mRNA 

Feature: 

Name/Key: sig peptide 

Location: 178..261

Identification method: S 

Name/Key: mat peptide 

Location: 262..6816

Identification method: S 
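
The feature coordinates above can be cross-checked against the stated sequence lengths. The following is a minimal sketch, not part of the original listing: the helper span_length is a hypothetical name, and it assumes the locations are 1-based, inclusive nucleotide positions.

```python
def span_length(start, end):
    """Number of nucleotides in a 1-based, inclusive location start..end."""
    return end - start + 1


sig_peptide = (178, 261)    # "sig peptide" location from the feature table
mat_peptide = (262, 6816)   # "mat peptide" location from the feature table

sig_codons = span_length(*sig_peptide) // 3   # 84 nt
mat_codons = span_length(*mat_peptide) // 3   # 6555 nt

# Both features must cover whole codons.
assert span_length(*sig_peptide) % 3 == 0
assert span_length(*mat_peptide) % 3 == 0

# Combined coding span, from the first base of the signal peptide
# to the last base of the mature peptide.
coding_nt = span_length(sig_peptide[0], mat_peptide[1])
total_codons = sig_codons + mat_codons
```

Under these assumptions the two features together span 6639 nucleotides and 2213 codons, which agrees with the position at which the preceding nucleotide listing ends (6639) and with the length stated for Sequence ID No. 2 (2213).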
Sequence : 

CCGCGAGCCG CACACGTGAC GGCGCCGCGC CGCGCCGCGC CGCGCCGAGC GGGACCCAGC 60 
GGCTGCCCGG AGCCCCGGGA GCGGCGCGCG CGCGGCCCCG GCCCCGCCGC TCGGCCGGCG 120 
GCGCGCTGCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GCCGCGTTCG CCCGAACATG 180 

Met

GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC CCC TTC CTA TTC ACC 228 
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr 

5 10 15 

CTG GTC GCG CTG CTG CCG CCC GGG GCT CTC TGC GAG GTG TGG ACG CGG 276
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Arg 

20 25 30 

ACA CTG CAC GGC GGC CGC GCG CCC TTA CCC CAG GAG CGG GGC TTC CGC 324 
Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe Arg

35 40 45 

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG CTG TGG GAG CGC GGG GAT 372 
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly Asp
50 55 60 65 

GCC AGG GGG GCG AGC CGG GCG GAC GAG AAG CCG CTC CGG AGG AGA CGG 420 
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg Arg 

70 75 80 

AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTC AGC 468 
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser

85 90 95 

CTC AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCC GGA GAG AAA 516 
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys

100 105 110 

AGC AAC GTG ATC GTG GCC TTG GCC CGG GAC AGC CTG GCG TTG GCC AGG 564 
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg

115 120 125 

CCC AGG AGC AGT GAT GTG TAC GTG TCT TAT GAC TAT GGA AAA TCA TTC 612 
Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe 
130 135 140 145

AAT AAG ATT TCA GAG AAA TTG AAC TTC GGC GCG GGA AAT AAC ACA GAG 660 
Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr Glu

150 155 160 

GCT GTG GTG GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAA CGG TAC 708 
Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr

165 170 175 

ATC TTC GCA GAT GCC TAC GCC CAG TAT CTC TGG ATC ACG TTT GAC TTC 756 
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe

180 185 190 

TGC AAC ACC ATC CAT GGC TTT TCC ATC CCG TTC CGG GCA GCT GAT CTC 804 
Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu

195 200 205 

CTA CTC CAC AGT AAG GCC TCC AAC CTT CTC CTG GGC TTC GAC AGG TCT 852 
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser 
210 215 220 225 

CAC CCC AAC AAG CAG CTG TGG AAG TCG GAT GAT TTT GGC CAG ACC TGG 900 
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp

230 235 240 

ATC ATG ATT CAA GAA CAC GTG AAG TCC TTT TCT TGG GGA ATT GAT CCC 948 

Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro

245 250 255 

TAT GAC AAA CCA AAC ACC ATC TAC ATC GAA CGG CAC GAA CCT TCT GGC 996 

Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly

260 265 270 

TAC TCC ACG GTT TTC CGA AGT ACA GAC TTC TTC CAG TCC CGG GAA AAC 1044 

Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn

275 280 285 

CAG GAA GTG ATC TTG GAG GAA GTG AGA GAC TTT CAG CTT CGG GAC AAG 1092

Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys

290 295 300 305 

TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC AGT CCA CTG CAG 1140 

Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu Gln

310 315 320 

TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG CGG GCC 1188 

Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala

325 330 335 

GCC CAG TTT GTT ACA AGA CAT CCT ATC AAC GAA TAT TAC ATC GCG GAT 1236 

Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp

340 345 350 

GCC TCG GAG GAC CAG GTG TTT GTG TGT GTC AGT CAC AGC AAC AAC CGC 1284 

Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg

355 360 365 

ACC AAC CTC TAC ATC TCG GAG GCA GAG GGC TTG AAG TTC TCT CTG TCC 1332 

Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385 

CTG GAG AAC GTG CTC TAC TAC ACC CCG GGA GGG GCC GGC AGT GAC ACC 1380 

Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp Thr 

390 395 400 

TTG GTG AGG TAC TTT GCA AAT GAA CCG TTT GCT GAC TTC CAT CGT GTG 1428 

Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val 

405 410 415 

GAA GGG TTG CAG GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG 1476 

Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met

420 425 430 

AAT GAG GAG AAC ATG AGA TCT GTC ATC ACC TTT GAC AAA GGG GGC ACC 1524 

Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr

435 440 445

TGG GAA TTT CTG CAG GCT CCA GCC TTC ACG GGG TAT GGA GAG AAA ATC 1572 
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile
450 455 460 465 

AAC TGT GAG CTG TCC GAG GGC TGT TCC CTC CAC CTG GCC CAG CGC CTC 1620 
Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg Leu

470 475 480 

AGC CAG CTG CTC AAC CTC CAG CTC CGG AGG ATG CCC ATC CTG TCC AAG 1668
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys

485 490 495 

GAG TCG GCG CCT GGC CTC ATC ATT GCC ACG GGC TCA GTG GGA AAG AAC 1716 
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn

500 505 510 

TTG GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1764 
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg

515 520 525 

TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAT ACA TGG GGA GAC CAT 1812 
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His 
530 535 540 545 

GGC GGC ATC ATC ATG GCC ATT GCC CAA GGC ATG GAA ACC AAC GAA CTG 1860 
Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu

550 555 560 

AAG TAC AGT ACC AAC GAA GGG GAG ACC TGG AAA GCC TTC ACC TTC TCT 1908 
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe Ser 

565 570 575 

GAG AAG CCC GTG TTT GTG TAT GGG CTC CTC ACG GAA CCC GGC GAG AAG 1956 
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys 
580 585 590

AGC ACG GTC TTC ACC ATC TTT GGC TCC AAC AAG GAG AAC GTG CAC AGC 2004
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser

595 600 605 

TGG CTC ATC CTC CAG GTC AAT GCC ACA GAC GCC CTG GGG GTT CCT TGC 2052 
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
610 615 620 625 

ACA GAG AAC GAC TAC AAG CTC TGG TCA CCA TCT GAT GAG CGG GGG AAT 2100 
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn 

630 635 640 

GAG TGT TTG CTT GGA CAC AAG ACT GTT TTC AAA CGG AGG ACC CCG CAC 2148 
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His 

645 650 655 

GCC ACA TGC TTT AAC GGA GAA GAC TTT GAC AGG CCG GTG GTT GTG TCC 2196 
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser 

660 665 670 

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTT GGC TTC CGG 2244 
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Arg 

675 680 685 

ATG AGT GAA GAC TTG GCA TTA GAG GTG TGT GTT CCA GAT CCA GGA TTT 2292 
Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly Phe 
690 695 700 705 

TCT GGA AAG TCC TCC CCT CCA GTG CCT TGT CCC GTG GGC TCT ACG TAC 2340 
Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr Tyr 

710 715 720 

AGG CGA TCA AGA GGC TAC CGG AAG ATT TCT GGG GAC ACC TGT AGT GGA 2388 
Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser Gly

725 730 735 

GGA GAT GTT GAG GCA CGG CTA GAA GGA GAG CTG GTC CCC TGT CCC CTG 2436
Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro Leu

740 745 750

GCA GAA GAG AAC GAG TTC ATC CTG TAC GCC ACG CGC AAG TCC ATC CAC 2484
Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile His

755 760 765

CGC TAT GAC CTG GCT TCC GGA ACC ACG GAG CAG TTG CCC CTC ACT GGG 2532
Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr Gly

770 775 780 785

TTC CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGC CTG 2580
Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys Leu

790 795 800

TAT TGG TCT GAC CTG GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG AAC 2628
Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu Asn

805 810 815

GGG AGT ACA GGA CAA GAG GTG ATC ATC AAC TCT GAC CTG GAG ACG GTA 2676
Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr Val

820 825 830

GAA GCT TTG GCT TTT GAA CCC CTC AGC CAA TTA CTT TAC TGG GTG GAC 2724
Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val Asp

835 840 845

GCA GGC TTT AAA AAG ATC GAG GTA GCC AAT CCA GAT GGT GAC TTC CGA 2772
Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe Arg

850 855 860 865

CTC ACC GTC GTC AAT TCC TCG GTG CTG GAT CGG CCC CGG GCC CTG GTC 2820
Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu Val

870 875 880

CTT GTG CCC CAA GAA GGG ATC ATG TTC TGG ACC GAC TGG GGA GAC CTG 2868
Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp Leu

885 890 895

AAG CCT GGG ATT TAT CGG AGC AAC ATG GAC GGA TCT GCC GCC TAT CGC 2916 
Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr Arg

900 905 910 

CTC GTG TCG GAG GAT GTG AAG TGG CCC AAT GGC ATT TCC GTG GAC GAT 2964 
Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp Asp

915 920 925 

CAG TGG ATC TAC TGG ACG GAT GCC TAC CTG GAC TGC ATT GAG CGC ATC 3012 
Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg Ile
930 935 940 945 

ACG TTC AGC GGC CAG CAG CGC TCC GTC ATC CTG GAC AGA CTC CCG CAC 3060 
Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Arg Leu Pro His

950 955 960 

CCC TAT GCC ATT GCT GTC TTT AAG AAT GAG ATT TAC TGG GAT GAC TGG 3108 
Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp Trp

965 970 975 

TCA CAG CTC AGC ATA TTC CGA GCT TCT AAG TAC AGC GGG TCC CAG ATG 3156 
Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln Met

980 985 990

GAG ATT CTG GCC AGC CAG CTC ACG GGG CTG ATG GAC ATG AAG ATC TTC 3204 
Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile Phe

995 1000 1005 

TAC AAG GGG AAG AAC ACA GGA AGC AAT GCG TGT GTA CCC AGG CCG TGC 3252 
Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro Cys 
1010 1015 1020 1025 

AGC CTG CTG TGC CTG CCC AGA GCC AAC AAC AGC AAA AGC TGC AGG TGT 3300 
Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg Cys 
1030 1035 1040

CCA GAT GGC GTG GCC AGC AGT GTC CTC CCT TCC GGG GAC CTG ATG TGT 3348
Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met Cys 

1045 1050 1055 

GAC TGC CCT AAG GGC TAC GAG CTG AAG AAC AAC ACG TGT GTC AAA GAA 3396 
Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys Glu 

1060 1065 1070 

GAA GAC ACC TGT CTG CGC AAC CAG TAC CGC TGC AGC AAC GGG AAC TGC 3444 
Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn Cys

1075 1080 1085 

ATC AAC AGC ATC TGG TGG TGC GAT TTC GAC AAC GAC TGC GGA GAC ATG 3492 
Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp Met
1090 1095 1100 1105 

AGC GAC GAG AAG AAC TGC CCT ACC ACC ATC TGC GAC CTG GAC ACC CAG 3540 
Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr Gln

1110 1115 1120 

TTC CGT TGC CAG GAG TCT GGG ACG TGC ATC CCG CTC TCC TAC AAA TGT 3588 
Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys Cys

1125 1130 1135 

GAC CTC GAG GAT GAC TGT GGG GAC AAC AGT GAC GAA AGG CAC TGT GAA 3636 
Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys Glu 

1140 1145 1150 

ATG CAC CAG TGC CGG AGC GAC GAA TAC AAC TGC AGC TCG GGC ATG TGC 3684 
Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met Cys

1155 1160 1165 

ATC CGC TCC TCC TGG GTG TGC GAC GGG GAC AAC GAC TGC AGG GAC TGG 3732 
Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp Trp
1170 1175 1180 1185 

TCC GAC GAG GCC AAC TGC ACA GCC ATC TAT CAC ACC TGT GAG GCC TCC 3780
Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala Ser

1190 1195 1200

AAC TTC CAG TGC CGC AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG TGT 3828 
Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala Cys

1205 1210 1215 

GAC GGC GAC GCC GAC TGC CAG GAT GGC TCT GAT GAG GAT CCA GCC AAC 3876 
Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala Asn

1220 1225 1230 

TGT GAG AAG AAG TGC AAC GGC TTC CGC TGC CCG AAC GGC ACC TGC ATT 3924 
Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys Ile

1235 1240 1245 

CCC TCC ACC AAG CAC TGT GAC GGC CTG CAC GAT TGC TCG GAC GGC TCC 3972 
Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly Ser 
1250 1255 1260 1265 

GAC GAG CAG CAC TGC GAG CCC CTG TGT ACA CGG TTC ATG GAC TTC GTG 4020 
Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe Val

1270 1275 1280 

TGT AAG AAC CGC CAG CAG TGC CTC TTC CAC TCC ATG GTG TGC GAT GGG 4068 
Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp Gly

1285 1290 1295 

ATC ATC CAG TGC CGT GAC GGC TCC GAC GAG GAC CCA GCC TTT GCA GGA 4116 
Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala Gly

1300 1305 1310 

TGC TCC CGA GAC CCC GAG TTC CAC AAG GTG TGC GAT GAG TTC GGC TTC 4164 
Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly Phe 

1315 1320 1325 

CAG TGT CAG AAC GGC GTG TGC ATC AGC TTG ATC TGG AAG TGC GAC GGG 4212 
Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp Gly

1330 1335 1340 1345

ATG GAT GAC TGC GGG GAC TAC TCC GAC GAG GCC AAC TGT GAA AAC CCC 4260 

Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn Pro 

1350 1355 1360 

ACA GAA GCC CCC AAC TGC TCC CGC TAC TTC CAG TTC CGG TGT GAC AAT 4308 
Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp Asn

1365 1370 1375 

GGC CAC TGC ATC CCC AAC AGG TGG AAG TGT GAC AGG GAG AAT GAC TGT 4356 
Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp Cys

1380 1385 1390 

GGG GAC TGG TCC GAC GAG AAG GAC TGT GGA GAT TCA CAT GTA CTT CCG 4404 
Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu Pro 

1395 1400 1405 

TCT ACG ACT CCT GCA CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC TGC 4452 
Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg Cys 

1410 1415 1420 1425 

GGC GGG GGG GCC TGC GTG ATA GAC ACG TGG GTT TGT GAC GGG TAC CGA 4500 

Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr Arg

1430 1435 1440

GAT TGC GCA GAT GGA TCC GAC GAG GAA GCC TGC CCC TCG CTC CCC AAT 4548 
Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro Asn 

1445 1450 1455 

GTC ACT GCC ACC TCC TCC CCC TCC CAG CCT GGA CGA TGC GAC CGA TTT 4596 
Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg Phe

1460 1465 1470 

GAG TTT GAG TGC CAC CAG CCA AAG AAG TGC ATC CCT AAC TGG AGA CGC 4644 
Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg Arg
1475 1480 1485

TGT GAC GGC CAT CAG GAT TGC CAG GAT GGC CAG GAC GAG GCC AAC TGC 4692 
Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn Cys
1490 1495 1500 1505 

CCC ACT CAC AGC ACC TTG ACC TGC ATG AGC TGG GAG TTC AAG TGT GAG 4740 
Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys Glu 

1510 1515 1520 

GAT GGC GAG GCC TGC ATC GTG CTG TCA GAA CGC TGC GAC GGC TTC CTG 4788 
Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe Leu

1525 1530 1535 

GAC TGC TCA GAT GAG AGC GAC GAG AAG GCC TGC AGT GAT GAG TTA ACT 4836
Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu Thr 

1540 1545 1550 

GTA TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG AAT 4884 
Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly Asn

1555 1560 1565 

GTC ACT TTG ACC TGG ATG CGG CCC AAA AAA ATG CCC TCT GCT GCT TGT 4932 
Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala Cys 
1570 1575 1580 1585 

GTA TAC AAC GTG TAC TAT AGA GTT GTT GGA GAG AGC ATA TGG AAG ACT 4980 
Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys Thr

1590 1595 1600 

CTG GAG ACT CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTG TTG AAA 5028 
Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu Lys 

1605 1610 1615 

CCA GAT ACC ACC TAC CAG GTT AAA GTG CAG GTT CAG TGC CTG AGC AAG 5076 
Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser Lys

1620 1625 1630 

GTG CAC AAC ACC AAT GAC TTT GTG ACC TTG AGA ACT CCA GAG GGA TTG 5124
Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly Leu

1635 1640 1645 

CCA GAC GCC CCT CAG AAC CTC CAG CTG TCG CTC CAC GGG GAA GAG GAA 5172 
Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu Glu
1650 1655 1660 1665 

GGT GTG ATT GTG GGC CAC TGG AGC CCT CCC ACC CAC ACC CAC GGC CTC 5220 
Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly Leu

1670 1675 1680 

ATT CGC GAA TAC ATT GTA GAG TAT AGC AGG AGT GGT TCC AAG GTG TGG 5268 
Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val Trp

1685 1690 1695 

ACT TCA GAA AGG GCT GCT AGT AAC TTT ACA GAA ATA AAG AAC TTG TTG 5316 
Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu Leu

1700 1705 1710 

GTC AAC ACC CTG TAC ACC GTC AGA GTG GCT GCG GTG ACG AGT CGT GGG 5364 
Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg Gly 

1715 1720 1725 

ATA GGA AAC TGG AGC GAT TCC AAA TCC ATT ACC ACC GTG AAA GGA AAA 5412 
Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly Lys
1730 1735 1740 1745 

GCG ATC CCG CCA CCA AAT ATC CAC ATT GAC AAC TAC GAT GAA AAT TCC 5460 
Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn Ser

1750 1755 1760 

CTG AGT TTT ACC CTG ACC GTG GAT GGG AAC ATC AAG GTG AAT GGC TAT 5508 
Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly Tyr

1765 1770 1775 

GTG GTG AAC CTT TTC TGG GCA TTT GAC ACC CAC AAA CAA GAG AAG AAA 5556 
Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys Lys
1780 1785 1790

ACC ATG AAC TTC CAA GGG AGC TCA GTG TCC CAC AAA GTT GGC AAT CTG 5604 
Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn Leu

1795 1800 1805 

ACA GCA CAG ACG GCC TAT GAG ATT TCC GCC TGG GCC AAG ACT GAC TTG 5652 
Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp Leu
1810 1815 1820 1825 

GGC GAT AGT CCT CTG TCA TTT GAG CAT GTC ACG ACC AGA GGG GTT CGC 5700 
Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val Arg 

1830 1835 1840

CCA CCT GCT CCT AGC CTC AAG GCC AGG GCT ATC AAT CAG ACT GCA GTG 5748 
Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala Val

1845 1850 1855 

GAA TGC ACC TGG ACA GGC CCC AGG AAT GTG GTG TAT GGC ATT TTC TAT 5796 
Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe Tyr

1860 1865 1870

GCC ACA TCC TTC CTG GAC CTC TAC CGC AAC CCA AGC AGC CTG ACC ACG 5844 
Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr Thr 

1875 1880 1885 

CCG CTG CAC AAC GCA ACC GTG CTC GTC GGT AAG GAT GAG CAG TAT CTG 5892 
Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr Leu
1890 1895 1900 1905 

TTT CTG GTC CGG GTG GTG ATG CCC TAC CAA GGG CCG TCC TCG GAC TAC 5940 
Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp Tyr

1910 1915 1920 

GTG GTC GTG AAG ATG ATC CCG GAC AGC AGG CTT CCT CCC CGG CAC CTG 5988 
Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His Leu
1925 1930 1935 
CAT GCC GTT CAC ACC GGC AAG ACC TCG GCC GTC ATC AAG TGG GAG TCG 6036
His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu Ser

1940 1945 1950 

CCC TAC GAC TCT CCT GAC CAG GAC CTG TTC TAT GCG ATC GCA GTT AAA 6084 
Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val Lys

1955 1960 1965 

GAT CTG ATA CGA AAG ACG GAC CGG AGC TAC AAA GTC AAG TCC CGC AAC 6132 
Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg Asn
1970 1975 1980 1985 

AGC ACC GTG GAG TAC ACC CTG AGC AAG CTG GAG CCC GGA GGG AAA TAC 6180 
Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys Tyr 

1990 1995 2000 

CAC GTC ATT GTG CAG CTG GGG AAC ATG AGC AAA GAT GCC AGT GTG AAG 6228 
His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val Lys

2005 2010 2015 

ATC ACC ACC GTT TCG TTA TCG GCA CCC GAT GCC TTA AAA ATC ATA ACA 6276 
Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile Thr

2020 2025 2030 

GAA AAT GAC CAC GTC CTT CTC TTC TGG AAA AGT CTA GCT CTA AAG GAA 6324 
Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys Glu 

2035 2040 2045

AAG TAT TTT AAC GAA AGC AGG GGC TAC GAG ATA CAC ATG TTT GAT AGC 6372 
Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp Ser
2050 2055 2060 2065 

GCC ATG AAT ATC ACC GCA TAC CTT GGG AAT ACT ACT GAC AAT TTC TTT 6420 
Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe Phe

2070 2075 2080 

AAA ATT TCC AAC CTG AAG ATG GGT CAC AAT TAC ACA TTC ACG GTC CAG 6468 
Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val Gln

2085 2090 2095 

GCA CGA TGC CTT TTG GGC AGC CAG ATC TGC GGG GAG CCT GCC GTG CTA 6516 
Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val Leu

2100 2105 2110 

CTG TAT GAT GAG CTG GGG TCT GGT GGC GAT GCG TCG GCG ATG CAG GCT 6564 
Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln Ala

2115 2120 2125 

GCC AGG TCT ACT GAT GTC GCC GCC GTG GTG GTG CCC ATC CTG TTT CTG 6612 
Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe Leu
2130 2135 2140 2145 

ATA CTG CTG AGC CTG GGG GTC GGG TTT GCC ATC CTG TAC ACG AAG CAT 6660 
Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys His

2150 2155 2160 

CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC AGC 6708 
Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr Ser

2165 2170 2175 

TCC AGA CTC GGC TCC GCC ATC TTC TCC TCT GGG GAT GAC TTG GGG GAG 6756 
Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly Glu

2180 2185 2190 

GAT GAT GAA GAT GCT CCT ATG ATC ACT GGA TTT TCG GAC GAC GTC CCC 6804 
Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val Pro

2195 2200 2205 

ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6856 
Met Val Ile Ala
2210 

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6916 
GTTATTTTTA TATGGGCCAA AAACAAAAGC AAAAAAAAAA AAAAA 6961 
Sequence ID No. 4

Length of the Sequence: 300 

Type: nucleic acid 

Strandedness : double 

Topology : linear 

Molecular type: cDNA to mRNA 

Sequence : 

ATATCCACAT TGACAGCTAT GGTGAAAATT ATCTAAGCTT CACCCTGACC ATGGAGAGTG 60 
ATATCAAGGT GAATGGCTAT GTGGTGAACC TTTTCTGGGC ATTTGACACC CACAAGCAAG 120 
AGAGGAGAAC TTTGAACTTC CGAGGAAGCA TATTGTCACA CAAAGTTGGC AATCTGACAG 180 
CTCATACATC CTATGAGATT TCTGCCTGGG CCAAGACTGA CTTGGGGGAT AGCCCTCTGG 240 
CATTTGAGCA TGTTATGACC AGAGGGGTTC GCCCACCTGC ACCTAGCCTC AAGGCCAAAG 300 
Sequence ID No. 5
Length of the Sequence: 6642 
Type : nucleic acid 
Strandedness : double 
Topology : linear 
Molecular type: cDNA to mRNA 
Sequence : 

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCGT TCCTATTCAC CCTGGTCGCA 60 
CTGCTGCCGC CCGGAGCTCT CTGCGAAGTC TGGACGCAGA GGCTGCACGG CGGCAGCGCG 120 
CCCTTGCCCC AGGACCGGGG CTTCCTCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180 
TGGGCGCGCG GGGATGCCAG GGGGGCGAGC CGCGCGGACG AGAAGCCGCT CCGGAGGAAA 240 
CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTTAG TCTGAATGAT 300 
TCCCACAATC AGATGGTGGT GCACTGGGCT GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360 
GCCCGAGATA GCCTGGCATT GGCGAGGCCC AAGAGCAGTG ATGTGTACGT GTCTTACGAC 420 
TATGGAAAAT CATTCAAGAA AATTTCAGAC AAGTTAAACT TTGGCTTGGG AAATAGGAGT 480 
GAAGCTGTTA TCGCCCAGTT CTACCACAGC CCTGCGGACA ACAAGCGGTA CATCTTTGCA 540

GACGCTTATG CCCAGTACCT CTGGATCACG TTTGACTTCT GCAACACTCT TCAAGGCTTT 600 

TCCATCCCAT TTCGGGCAGC TGATCTCCTC CTACACAGTA AGGCCTCCAA CCTTCTCTTG 660 

GGCTTTGACA GGTCCCACCC CAACAAGCAG CTGTGGAAGT CAGATGACTT TGGCCAGACC 720 

TGGATCATGA TTCAGGAACA TGTCAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780 

CCAAATACCA TCTACATTGA ACGACACGAA CCCTCTGGCT ACTCCACTGT CTTCCGAAGT 840 

ACAGATTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCC TTGAGGAAGT GAGAGATTTT 900 

CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTGAACAG 960 

CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGAGAGC AGCCCAGTTT 1020 

GTCACAAGAC ATCCTATTAA TGAATATTAC ATCGCAGATG CCTCCGAGGA CCAGGTGTTT 1080 

GTGTGTGTCA GCCACAGTAA CAACCGCACC AATTTATACA TCTCAGAGGC AGAGGGGCTG 1140 

AAGTTCTCCC TGTCCTTGGA GAACGTGCTC TATTACAGCC CAGGAGGGGC CGGCAGTGAC 1200 

ACCTTGGTGA GGTATTTTGC AAATGAACCA TTTGCTGACT TCCACCGAGT GGAAGGATTG 1260 

CAAGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCG 1320 

GTCATCACCT TTGACAAAGG GGGAACCTGG GAGTTTCTTC AGGCTCCAGC CTTCACGGGA 1380 

TATGGAGAGA AAATCAATTG TGAGCTTTCC CAGGGCTGTT CCCTTCATCT GGCTCAGCGC 1440 

CTCAGTCAGC TCCTCAACCT CCAGCTCCGG AGAATGCCCA TCCTGTCCAA GGAGTCGGCT 1500 

CCAGGCCTCA TCATCGCCAC TGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560 

TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAC 1620 

ACATGGGGAG ACCACGGCGG AATCATCACG GCCATTGCCC AGGGCATGGA AACCAACGAG 1680 

CTAAAATACA GTACCAATGA AGGGGAGACC TGGAAAACAT TCATCTTCTC TGAGAAGCCA 1740 

GTGTTTGTGT ATGGCCTCCT CACAGAACCT GGGGAGAAGA GCACTGTCTT CACCATCTTT 1800 

GGCTCGAACA AAGAGAATGT CCACAGCTGG CTGATCCTCC AGGTCAATGC CACGGATGCC 1860

TTGGGAGTTC CCTGCACAGA GAATGACTAC AAGCTGTGGT CACCATCTGA TGAGCGGGGG 1920 

AATGAGTGTT TGCTGGGACA CAAGACTGTT TTCAAACGGC GGACCCCCCA TGCCACATGC 1980 

TTCAATGGAG AGGACTTTGA CAGGCCGGTG GTCGTGTCCA ACTGCTCCTG CACCCGGGAG 2040 

GACTATGAGT GTGACTTCGG TTTCAAGATG AGTGAAGATT TGTCATTAGA GGTTTGTGTT 2100 

CCAGATCCGG AATTTTCTGG AAAGTCATAC TCCCCTCCTG TGCCTTGCCC TGTGGGTTCT 2160 
ACTTACAGGA GAACGAGAGG CTACCGGAAG ATTTCTGGGG ACACTTGTAG CGGAGGAGAT 2220

GTTGAAGCGC GACTGGAAGG AGAGCTGGTC CCCTGTCCCC TGGCAGAAGA GAACGAGTTC 2280 

ATTCTGTATG CTGTGAGGAA ATCCATCTAC CGCTATGACC TGGCCTCGGG AGCCACCGAG 2340 

CAGTTGCCTC TCACCGGGCT ACGGGCAGCA GTGGCCCTGG ACTTTGACTA TGAGCACAAC 2400 

TGTTTGTATT GGTCCGACCT GGCCTTGGAC GTCATCCAGC GCCTCTGTTT GAATGGAAGC 2460 

ACAGGGCAAG AGGTGATCAT CAATTCTGGC CTGGAGACAG TAGAAGCTTT GGCTTTTGAA 2520 

CCCCTCAGCC AGCTGCTTTA CTGGGTAGAT GCAGGCTTCA AAAAGATTGA GGTAGCTAAT 2580 

CCAGATGGCG ACTTCCGACT CACAATCGTC AATTCCTCTG TGCTTGATCG TCCCAGGGCT 2640 

CTGGTCCTCG TGCCCCAAGA GGGGGTGATG TTCTGGACAG ACTGGGGAGA CCTGAAGCCT 2700 

GGGATTTATC GGAGCAATAT GGATGGTTCT GCTGCCTATC ACCTGGTGTC TGAGGATGTG 2760 

AAGTGGCCCA ATGGCATCTC TGTGGACGAC CAGTGGATTT ACTGGACGGA TGCCTACCTG 2820 

GAGTGCATAG AGCGGATCAC GTTCAGTGGC CAGCAGCGCT CTGTCATTCT GGACAACCTC 2880 

CCGCACCCCT ATGCCATTGC TGTCTTTAAG AATGAAATCT ACTGGGATGA CTGGTCACAG 2940 

CTCAGCATAT TCCGAGCTTC CAAATACAGT GGGTCCCAGA TGGAGATTCT GGCAAACCAG 3000 

CTCACGGGGC TCATGGACAT GAAGATTTTC TACAAGGGGA AGAACACTGG AAGCAATGCC 3060 

TGTGTGCCCA GGCCATGCAG CCTGCTGTGC CTGCCCAAGG CCAACAACAG TAGAAGCTGC 3120 

AGGTGTCCAG AGGATGTGTC CAGCAGTGTG CTTCCATCAG GGGACCTGAT GTGTGACTGC 3180 

CCTCAGGGCT ATCAGCTCAA GAACAATACC TGTGTCAAAG AAGAGAACAC CTGTCTTCGC 3240 

AACCAGTATC GCTGCAGCAA CGGGAACTGT ATCAACAGCA TTTGGTGGTG TGACTTTGAC 3300 

AACGACTGTG GAGACATGAG CGATGAGAGA AACTGCCCTA CCACCATCTG TGACCTGGAC 3360 

ACCCAGTTTC GTTGCCAGGA GTCTGGGACT TGTATCCCAC TGTCCTATAA ATGTGACCTT 3420 
GAGGATGACT GTGGAGACAA CAGTGATGAA AGTCATTGTG AAATGCACCA GTGCCGGAGT 3480 
GACGAGTACA ACTGCAGTTC CGGCATGTGC ATCCGCTCCT CCTGGGTATG TGACGGGGAC 3540 
AACGACTGCA GGGACTGGTC TGATGAAGCC AACTGTACCG CCATCTATCA CACCTGTGAG 3600 
GCCTCCAACT TCCAGTGCCG AAACGGGCAC TGCATCCCCC AGCGGTGGGC GTGTGACGGG 3660 
GATACGGACT GCCAGGATGG TTCCGATGAG GATCCAGTCA ACTGTGAGAA GAAGTGCAAT 3720 
GGATTCCGCT GCCCAAACGG CACTTGCATC CCATCCAGCA AACATTGTGA TGGTCTGCGT 3780 
GATTGCTCTG ATGGCTCCGA TGAACAGCAC TGCGAGCCCC TCTGTACGCA CTTCATGGAC 3840 
TTTGTGTGTA AGAACCGCCA GCAGTGCCTG TTCCACTCCA TGGTCTGTGA CGGAATCATC 3900

CAGTGCCGCG ACGGGTCCGA TGAGGATGCG GCGTTTGCAG GATGCTCCCA AGATCCTGAG 3960 

TTCCACAAGG TATGTGATGA GTTCGGTTTC CAGTGTCAGA ATGGAGTGTG CATCAGTTTG 4020 

ATTTGGAAGT GCGACGGGAT GGATGATTGC GGCGATTATT CTGATGAAGC CAACTGCGAA 4080 

AACCCCACAG AAGCCCCAAA CTGCTCCCGC TACTTCCAGT TTCGGTGTGA GAATGGCCAC 4140 

TGCATCCCCA ACAGATGGAA ATGTGACAGG GAGAACGACT GTGGGGACTG GTCTGATGAG 4200 

AAGGATTGTG GAGATTCACA TATTCTTCCC TTCTCGACTC CTGGGCCCTC CACGTGTCTG 4260 

CCCAATTACT ACCGCTGCAG CAGTGGGACC TGCGTGATGG ACACCTGGGT GTGCGACGGG 4320 

TACCGAGATT GTGCAGATGG CTCTGACGAG GAAGCCTGCC CCTTGCTTGC AAACGTCACT 4380 

GCTGCCTCCA CTCCCACCCA ACTTGGGCGA TGTGACCGAT TTGAGTTCGA ATGCCACCAA 4440 

CCGAAGACGT GTATTCCCAA CTGGAAGCGC TGTGACGGCC ACCAAGATTG CCAGGATGGC 4500 

CGGGACGAGG CCAATTGCCC CACACACAGC ACCTTGACTT GCATGAGCAG GGAGTTCCAG 4560 

TGCGAGGACG GGGAGGCCTG CATTGTGCTC TCGGAGCGCT GCGACGGCTT CCTGGACTGC 4620 

TCGGACGAGA GCGATGAAAA GGCCTGCAGT GATGAGTTGA CTGTGTACAA AGTACAGAAT 4680 

CTTCAGTGGA CAGCTGACTT CTCTGGGGAT GTGACTTTGA CCTGGATGAG GCCCAAAAAA 4740 

ATGCCCTCTG CATCTTGTGT ATATAATGTC TACTACAGGG TGGTTGGAGA GAGCATATGG 4800 

AAGACTCTGG AGACCCACAG CAATAAGACA AACACTGTAT TAAAAGTCTT GAAACCAGAT 4860 

ACCACGTATC AGGTTAAAGT ACAGGTTCAG TGTCTCAGCA AGGCACACAA CACCAATGAC 4920 

TTTGTGACCC TGAGGACCCC AGAGGGATTG CCAGATGCCC CTCGAAATCT CCAGCTGTCA 4980 

CTCCCCAGGG AAGCAGAAGG TGTGATTGTA GGCCACTGGG CTCCTCCCAT CCACACCCAT 5040 

GGCCTCATCC GTGAGTACAT TGTAGAATAC AGCAGGAGTG GTTCCAAGAT GTGGGCCTCC 5100 

CAGAGGGCTG CTAGTAACTT TACAGAAATC AAGAACTTAT TGGTCAACAC TCTATACACC 5160 

GTCAGAGTGG CTGCGGTGAC TAGTCGTGGA ATAGGAAACT GGAGCGATTC TAAATCCATT 5220 

ACCACCATAA AAGGAAAAGT GATCCCACCA CCAGATATCC ACATTGACAG CTATGGTGAA 5280 

AATTATCTAA GCTTCACCCT GACCATGGAG AGTGATATCA AGGTGAATGG CTATGTGGTG 5340 

AACCTTTTCT GGGCATTTGA CACCCACAAG CAAGAGAGGA GAACTTTGAA CTTCCGAGGA 5400 

AGCATATTGT CACACAAAGT TGGCAATCTG ACAGCTCATA CATCCTATGA GATTTCTGCC 5460 

TGGGCCAAGA CTGACTTGGG GGATAGCCCT CTGGCATTTG AGCATGTTAT GACCAGAGGG 5520 
GTTCGCCCAC CTGCACCTAG CCTCAAGGCC AAAGCCATCA ACCAGACTGC AGTGGAATGT 5580
ACCTGGACCG GCCCCCGGAA TGTGGTTTAT GGTATTTTCT ATGCCACGTC CTTTCTTGAC 5640
CTCTATCGCA ACCCGAAGAG CTTGACTACT TCACTCCACA ACAAGACGGT CATTGTCAGT 5700
AAGGATGAGC AGTATTTGTT TCTGGTCCGT GTAGTGGTAC CCTACCAGGG GCCATCCTCT 5760
GACTACGTTG TAGTGAAGAT GATCCCGGAC AGCAGGCTTC CACCCCGTCA CCTGCATGTG 5820
GTTCATACGG GCAAAACCTC CGTGGTCATC AAGTGGGAAT CACCGTATGA CTCTCCTGAC 5880
CAGGACTTGT TGTATGCAAT TGCAGTCAAA GATCTCATAA GAAAGACTGA CAGGAGCTAC 5940
AAAGTAAAAT CCCGTAACAG CACTGTGGAA TACACCCTTA ACAAGTTGGA GCCTGGCGGG 6000
AAATACCACA TCATTGTCCA ACTGGGGAAC ATGAGCAAAG ATTCCAGCAT AAAAATTACC 6060
ACAGTTTCAT TATCAGCACC TGATGCCTTA AAAATCATAA CAGAAAATGA TCATGTTCTT 6120
CTCTTTTGGA AAAGCCTGGC TTTAAAGGAA AAGCATTTTA ATGAAAGCAG GGGCTATGAG 6180
ATACACATGT TTGATAGTGC CATGAATATC ACAGCTTACC TTGGGAATAC TACTGACAAT 6240
TTCTTTAAAA TTTCCAACCT GAAGATGGGT CATAATTACA CGTTCACCGT CCAAGCAAGA 6300
TGCCTTTTTG GCAACCAGAT CTGTGGGGAG CCTGCCATCC TGCTGTACGA TGAGCTGGGG 6360
TCTCGTGCAG ATGCATCTGC AACGCAGGCT GCCAGATCTA CGGATGTTGC TGCTGTGGTG 6420
GTGCCCATCT TATTCCTGAT ACTGCTGAGC CTGGGGGTGG GGTTTGCCAT CCTGTACACG 6480
AAGCACCGGA GGCTGCAGAG CAGCTTCACC GCCTTCGCCA ACAGCCACTA CAGCTCCAGG 6540
CTGGGGTCCG CAATCTTCTC CTCTGGGGAT GACCTGGGGG AAGATGATGA AGATGCCCCT 6600
ATGATAACTG GATTTTCAGA TGACGTCCCC ATGGTGATAG CC 6642

Sequence ID No. 6
Length of the Sequence: 2214 
Type : amino acid 
Topology: linear 
Molecular type: Protein 
Sequence : 



Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe 

5 10 15 

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr
20 25 30

Gln Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe

35 40 45

Leu Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly

50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val

85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu

100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala

115 120 125

Arg Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser

130 135 140

Phe Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser
145 150 155 160

Glu Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg

165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp

180 185 190

Phe Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp

195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg

210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp
245 250 255

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser

260 265 270

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu

275 280 285

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp

290 295 300

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln
305 310 315 320

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg

325 330 335

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala

340 345 350

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn

355 360 365

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu

370 375 380

Ser Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp
385 390 395 400

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg

405 410 415

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser

420 425 430

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly

435 440 445

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys

450 455 460

Ile Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg
465 470 475 480

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser

485 490 495

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys

500 505 510

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala

515 520 525

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp

530 535 540

His Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe

565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu

580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His

595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro

610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro

645 650 655

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val

660 665 670

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe

675 680 685

Lys Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu
690 695 700

Phe Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser 
705 710 715 720 

Thr Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys

725 730 735

Ser Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys

740 745 750

Pro Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser

755 760 765

Ile Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu

770 775 780

Thr Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn
785 790 795 800

Cys Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys

805 810 815

Leu Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu

820 825 830

Thr Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp

835 840 845

Val Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp

850 855 860

Phe Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala
865 870 875 880

Leu Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly

885 890 895

Asp Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala

900 905 910

Tyr His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val






915 920 925 

Asp Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu

930 935 940

Arg Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu
945 950 955 960

Pro His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp

965 970 975

Asp Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser

980 985 990

Gln Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys

995 1000 1005

Ile Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg

1010 1015 1020

Pro Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys
1025 1030 1035 1040

Arg Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu

1045 1050 1055

Met Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val

1060 1065 1070

Lys Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly

1075 1080 1085

Asn Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly

1090 1095 1100

Asp Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp
1105 1110 1115 1120

Thr Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr

1125 1130 1135

Lys Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His
1140 1145 1150

Cys Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly

1155 1160 1165

Met Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg

1170 1175 1180

Asp Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu
1185 1190 1195 1200

Ala Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp

1205 1210 1215

Ala Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro

1220 1225 1230

Val Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr

1235 1240 1245

Cys Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp

1250 1255 1260

Gly Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp
1265 1270 1275 1280

Phe Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys

1285 1290 1295

Asp Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe

1300 1305 1310

Ala Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe

1315 1320 1325

Gly Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys

1330 1335 1340

Asp Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu
1345 1350 1355 1360

Asn Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys
1365 1370 1375

Glu Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn

1380 1385 1390

Asp Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile

1395 1400 1405

Leu Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr

1410 1415 1420

Arg Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly
1425 1430 1435 1440

Tyr Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu

1445 1450 1455

Ala Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp

1460 1465 1470

Arg Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp

1475 1480 1485

Lys Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala
1490 1495 1500

Asn Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln

1505 1510 1515 1520

Cys Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly
1525 1530 1535

Phe Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu

1540 1545 1550

Leu Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser

1555 1560 1565

Gly Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala

1570 1575 1580

Ser Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp
1585 1590 1595 1600

Lys Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val

1605 1610 1615

Leu Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu

1620 1625 1630

Ser Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu

1635 1640 1645

Gly Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu

1650 1655 1660

Ala Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His
1665 1670 1675 1680

Gly Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys

1685 1690 1695

Met Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn

1700 1705 1710

Leu Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser

1715 1720 1725

Arg Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys

1730 1735 1740

Gly Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu
1745 1750 1755 1760

Asn Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn

1765 1770 1775

Gly Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu

1780 1785 1790

Arg Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly

1795 1800 1805

Asn Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr
1810 1815 1820

Asp Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly 
1825 1830 1835 1840 

Val Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr

1845 1850 1855

Ala Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile

1860 1865 1870

Phe Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu

1875 1880 1885

Thr Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln

1890 1895 1900

Tyr Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser
1905 1910 1915 1920

Asp Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg

1925 1930 1935

His Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp

1940 1945 1950

Glu Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala

1955 1960 1965

Val Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser

1970 1975 1980

Arg Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly
1985 1990 1995 2000

Lys Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser

2005 2010 2015

Ile Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile

2020 2025 2030

Ile Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu
2035 2040 2045

Lys Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe

2050 2055 2060

Asp Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn
2065 2070 2075 2080

Phe Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr

2085 2090 2095

Val Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala

2100 2105 2110

Ile Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr

2115 2120 2125

Gln Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu

2130 2135 2140

Phe Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr
2145 2150 2155 2160

Lys His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His

2165 2170 2175

Tyr Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu

2180 2185 2190

Gly Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp

2195 2200 2205

Val Pro Met Val Ile Ala
2210 

Sequence ID No. 7 
Length of the Sequence: 6843 
Type: nucleic acid 
Strandedness : double 
Topology: linear 
Molecular type: cDNA to mRNA
Feature :
Name/Key: sig peptide
Location: 81..164
Identification method: S
Name/Key: mat peptide
Location: 165..6722
Identification method: S 
Sequence : 

CCG GCCCAGCGGC TCTCCTGGCC 23 
TCGCGCTGCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GTAGCGTTCG CCCGAACATG 83

Met 
1 

GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC CCG TTC CTA TTC ACC 131
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr 

5 10 15 

CTG GTC GCA CTG CTG CCG CCC GGA GCT CTC TGC GAA GTC TGG ACG CAG 179 
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Gln

20 25 30 

AGG CTG CAC GGC GGC AGC GCG CCC TTG CCC CAG GAC CGG GGC TTC CTC 227 
Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe Leu

35 40 45 

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG CTG TGG GCG CGC GGG GAT 275 
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly Asp

50 55 60 65 

GCC AGG GGG GCG AGC CGC GCG GAC GAG AAG CCG CTC CGG AGG AAA CGG 323 
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys Arg 
70 75 80 
AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTT AGT 371
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser

85 90 95 

CTG AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCT GGA GAG AAA 419 
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys

100 105 110

AGC AAC GTG ATC GTG GCC TTG GCC CGA GAT AGC CTG GCA TTG GCG AGG 467 
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg

115 120 125

CCC AAG AGC AGT GAT GTG TAC GTG TCT TAC GAC TAT GGA AAA TCA TTC 515 
Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe 
130 135 140 145 

AAG AAA ATT TCA GAC AAG TTA AAC TTT GGC TTG GGA AAT AGG AGT GAA 563 
Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser Glu

150 155 160 

GCT GTT ATC GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAG CGG TAC 611
Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr

165 170 175 

ATC TTT GCA GAC GCT TAT GCC CAG TAC CTC TGG ATC ACG TTT GAC TTC 659 
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe

180 185 190 

TGC AAC ACT CTT CAA GGC TTT TCC ATC CCA TTT CGG GCA GCT GAT CTC 707 
Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu

195 200 205 

CTC CTA CAC AGT AAG GCC TCC AAC CTT CTC TTG GGC TTT GAC AGG TCC 755 
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser 
210 215 220 225 

CAC CCC AAC AAG CAG CTG TGG AAG TCA GAT GAC TTT GGC CAG ACC TGG 803 
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp

230 235 240

ATC ATG ATT CAG GAA CAT GTC AAG TCC TTT TCT TGG GGA ATT GAT CCC 851
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro

245 250 255

TAT GAC AAA CCA AAT ACC ATC TAC ATT GAA CGA CAC GAA CCC TCT GGC 899
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly

260 265 270

TAC TCC ACT GTC TTC CGA AGT ACA GAT TTC TTC CAG TCC CGG GAA AAC 947
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn

275 280 285

CAG GAA GTG ATC CTT GAG GAA GTG AGA GAT TTT CAG CTT CGG GAC AAG 995
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys
290 295 300 305

TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC AGT GAA CAG CAG 1043
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln Gln

310 315 320

TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG AGA GCA 1091
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala

325 330 335

GCC CAG TTT GTC ACA AGA CAT CCT ATT AAT GAA TAT TAC ATC GCA GAT 1139
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp

340 345 350

GCC TCC GAG GAC CAG GTG TTT GTG TGT GTC AGC CAC AGT AAC AAC CGC 1187
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg

355 360 365

ACC AAT TTA TAC ATC TCA GAG GCA GAG GGG CTG AAG TTC TCC CTG TCC 1235
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385

TTG GAG AAC GTG CTC TAT TAC AGC CCA GGA GGG GCC GGC AGT GAC ACC 1283 

Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp Thr 

390 395 400 

TTG GTG AGG TAT TTT GCA AAT GAA CCA TTT GCT GAC TTC CAC CGA GTG 1331 
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val 

405 410 415 

GAA GGA TTG CAA GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG 1379 
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met

420 425 430 

AAT GAG GAG AAC ATG AGA TCG GTC ATC ACC TTT GAC AAA GGG GGA ACC 1427 
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr

435 440 445 

TGG GAG TTT CTT CAG GCT CCA GCC TTC ACG GGA TAT GGA GAG AAA ATC 1475 
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile
450 455 460 465 

AAT TGT GAG CTT TCC CAG GGC TGT TCC CTT CAT CTG GCT CAG CGC CTC 1523 
Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg Leu

470 475 480 

AGT CAG CTC CTC AAC CTC CAG CTC CGG AGA ATG CCC ATC CTG TCC AAG 1571 
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys

485 490 495 

GAG TCG GCT CCA GGC CTC ATC ATC GCC ACT GGC TCA GTG GGA AAG AAC 1619 
Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn

500 505 510 

TTG GCT AGC AAG ACA AAC GTG TAC ATC TCT AGC AGT GCT GGA GCC AGG 1667 
Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg
515 520 525 






TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAC ACA TGG GGA GAC CAC 1715
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His
530 535 540 545

GGC GGA ATC ATC ACG GCC ATT GCC CAG GGC ATG GAA ACC AAC GAG CTA 1763
Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu

550 555 560

AAA TAC AGT ACC AAT GAA GGG GAG ACC TGG AAA ACA TTC ATC TTC TCT 1811
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe Ser

565 570 575

GAG AAG CCA GTG TTT GTG TAT GGC CTC CTC ACA GAA CCT GGG GAG AAG 1859
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys

580 585 590

AGC ACT GTC TTC ACC ATC TTT GGC TCG AAC AAA GAG AAT GTC CAC AGC 1907
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser

595 600 605

TGG CTG ATC CTC CAG GTC AAT GCC ACG GAT GCC TTG GGA GTT CCC TGC 1955
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
610 615 620 625

ACA GAG AAT GAC TAC AAG CTG TGG TCA CCA TCT GAT GAG CGG GGG AAT 2003
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn

630 635 640

GAG TGT TTG CTG GGA CAC AAG ACT GTT TTC AAA CGG CGG ACC CCC CAT 2051
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His

645 650 655

GCC ACA TGC TTC AAT GGA GAG GAC TTT GAC AGG CCG GTG GTC GTG TCC 2099
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser

660 665 670

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTC GGT TTC AAG 2147
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Asn Cys Ser Cys Thr 
675 

ATG AGT GAA GAT TTG 
Met Ser Glu Asp Leu 
690 

TCT GGA AAG 
Ser Gly Lys 



so 



TAC AGG AGA 
Tyr Arg Arg 

GGA GGA GAT 
Gly Gly Asp 
740 

CTG GCA GAA 
Leu Ala Glu 

755 
TAC CGC TAT 
Tyr Arg Tyr 
770 

GGG CTA CGG 
Gly Leu Arg 

TTG TAT TGG 
Leu Tyr Trp 

AAT GGA AGC 
Asn Gly Ser 



TCA TAC 
Ser Tyr 
710 
ACG AGA 
Thr Arg 
725 

GTT GAA 
Val Glu 

GAG AAC 
Glu Asn 

GAC CTG 
Asp Leu 

GCA GCA 
Ala Ala 
790 
TCC GAC 
Ser Asp 
805 

ACA GGG 
Thr Gly 



Arg Glu Asp Tyr 
680 

TCA TTA GAG GTT 
Ser Leu Glu Val 
695 

TCC CCT CCT GTG 
Ser Pro Pro Val 



GGC TAC 
Gly Tyr 

GCG CGA 
Ala Arg 

GAG TTC 
Glu Phe 
760 
GCC TCG 
Ala Ser 
775 

GTG GCC 
Val Ala 

CTG GCC 
Leu Ala 

CAA GAG 
Gin Glu 



CGG AAG 
Arg Lys 
730 

CTG GAA 
Leu Glu 
745 

ATT CTG 
lie Leu 

GGA GCC 
Gly Ala 

CTG GAC 
Leu Asp 

TTG GAC 
Leu Asp 
810 
GTG ATC 
Val lie 



Glu Cys Asp Phe Gly Phe Lys 
685 

TGT GTT CCA GAT CCG GAA TTT 2195 
Cys Val Pro Asp Pro Glu Phe 
700 705 
CCT TGC CCT GTG GGT TCT ACT 2243 
Pro Cys Pro Val Gly Ser Thr 
715 720 
ATT TCT GGG GAC ACT TGT AGC 2291 
He Ser Gly Asp Thr Cys Ser 
735 

GGA GAG CTG GTC CCC TGT CCC 2339 
Gly Glu Leu Val Pro Cys Pro 
750 

TAT GCT GTG AGG AAA TCC ATC 2387 
Tyr Ala Val Arg Lys Ser He 
765 

ACC GAG CAG TTG CCT CTC ACC 2435 
Thr Glu Gin Leu Pro Leu Thr 
780 785 
TTT GAC TAT GAG CAC AAC TGT 2483 
Phe Asp Tyr Glu His Asn Cys 
795 800 
GTC ATC CAG CGC CTC TGT TTG 2531 
Val He Gin Arg Leu Cys Leu 
815 

ATC AAT TCT GGC CTG GAG ACA 2579 
He Asn Ser Gly Leu Glu Thr 
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GTA GAA 
Val Glu 
835 
GAT GCA 
Asp Ala 
850 

CGA CTC 
Arg Leu 

GTC CTC 
Val Leu 

CTG AAG 
Leu Lys 

CAC CTG 
His Leu 
915 
GAC CAG 
Asp Gin 
930 

ATC ACG 
He Thr 



820 

GCT TTG GCT 
Ala Leu Ala 

GGC TTC AAA 
Gly Phe Lys 

ACA ATC GTC 
Thr He Val 
870 

GTG CCC CAA 
Val Pro Gin 

885 
CCT GGG ATT 
Pro Gly He 
900 

GTG TCT GAG 
Val Ser Glu 



TTT GAA 
Phe Glu 
840 
AAG ATT 
Lys lie 
855 

AAT TCC 
Asn Ser 

GAG GGG 
Glu Gly 

TAT CGG 
Tyr Arg 



TGG ATT TAC 
Trp He Tyr 



50 



CAC CCC 
His Pro 



TTC AGT GGC 
Phe Ser Gly 
950 

TAT GCC ATT 
Tyr Ala He 
965 



GAT GTG 
Asp Val 
920 
TGG ACG 
Trp Thr 
935 

CAG CAG 
Gin Gin 

GCT GTC 
Ala Val 



825 

CCC CTC 
Pro Leu 

GAG GTA 
Glu Val 

TCT GTG 
Ser Val 

GTG ATG 
Val Met 

890 
AGC AAT 
Ser Asn 
905 

AAG TGG 
Lys Trp 

GAT GCC 
Asp Ala 

CGC TCT 
Arg Ser 

TTT AAG 
Phe Lys 
970 



AGC CAG CTG 
Ser Gin Leu 
845 

GCT AAT CCA 
Ala Asn Pro 

860 
CTT GAT CGT 
Leu Asp Arg 
875 

TTC TGG ACA 
Phe Trp Thr 

ATG GAT GGT 
Met Asp Gly 

CCC AAT GGC 
Pro Asn Gly 
925 

TAC CTG GAG 
Tyr Leu Glu 

940 
GTC ATT CTG 
Val lie Leu 
955 

AAT GAA ATC 
Asn Glu He 



830 

CTT TAC TGG GTA 
Leu Tyr Trp Val 



2627 



GAT GGC 
Asp Gly 

CCC AGG 
Pro Arg 

GAC TGG 
Asp Trp 

895 
TCT GCT 
Ser Ala 
910 

ATC TCT 
lie Ser 



GAC TTC 
Asp Phe 
865 
GCT CTG 
Ala Leu 
880 

GGA GAC 
Gly Asp 

GCC TAT 
Ala Tyr 

GTG GAC 
Val Asp 



TGC ATA GAG CGG 
Cys He Glu Arg 
945 

GAC AAC CTC CCG 
Asp Asn Leu Pro 
960 

TAC TGG GAT GAC 
Tyr Trp Asp Asp 
975 



2675 



2723 



2771 



2819 



2867 



2915 



2963 



3011 



TGG TCA CAG CTC AGC ATA TTC CGA GCT TCC AAA TAC AGT GGG TCC CAG 3059 
Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln

980 985 990 

ATG GAG ATT CTG GCA AAC CAG CTC ACG GGG CTC ATG GAC ATG AAG ATT 3107 
Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys Ile

995 1000 1005 

TTC TAC AAG GGG AAG AAC ACT GGA AGC AAT GCC TGT GTG CCC AGG CCA 3155 
Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro 
1010 1015 1020 1025 

TGC AGC CTG CTG TGC CTG CCC AAG GCC AAC AAC AGT AGA AGC TGC AGG 3203 
Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys Arg 

1030 1035 1040 

TGT CCA GAG GAT GTG TCC AGC AGT GTG CTT CCA TCA GGG GAC CTG ATG 3251 
Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu Met 

1045 1050 1055 

TGT GAC TGC CCT CAG GGC TAT CAG CTC AAG AAC AAT ACC TGT GTC AAA 3299 
Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val Lys

1060 1065 1070 

GAA GAG AAC ACC TGT CTT CGC AAC CAG TAT CGC TGC AGC AAC GGG AAC 3347 
Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn

1075 1080 1085 

TGT ATC AAC AGC ATT TGG TGG TGT GAC TTT GAC AAC GAC TGT GGA GAC 3395 
Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp
1090 1095 1100 1105 

ATG AGC GAT GAG AGA AAC TGC CCT ACC ACC ATC TGT GAC CTG GAC ACC 3443 
Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr

1110 1115 1120

CAG TTT CGT TGC CAG GAG TCT GGG ACT TGT ATC CCA CTG TCC TAT AAA 3491
Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys

1125 1130 1135

TGT GAC CTT GAG GAT GAC TGT GGA GAC AAC AGT GAT GAA ACT CAT TGT 
Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His Cys 

1140 1145 1150

GAA ATG CAC CAG TGC CGG AGT GAC GAG TAC AAC TGC AGT TCC GGC ATG 
Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met

1155 1160 1165

TGC ATC CGC TCC TCC TGG GTA TGT GAC GGG GAC AAC GAC TGC AGG GAC 
Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp
1170 1175 1180 1185

TGG TCT GAT GAA GCC AAC TGT ACC GCC ATC TAT CAC ACC TGT GAG GCC 
Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala
1190 1195 1200

TCC AAC TTC CAG TGC CGA AAC GGG CAC TGC ATC CCC CAG CGG TGG GCG 
Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala

1205 1210 1215 

TGT GAC GGG GAT ACG GAC TGC CAG GAT GGT TCC GAT GAG GAT CCA GTC 
Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Val

1220 1225 1230

AAC TGT GAG AAG AAG TGC AAT GGA TTC CGC TGC CCA AAC GGC ACT TGC 
Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys 

1235 1240 1245 

ATC CCA TCC AGC AAA CAT TGT GAT GGT CTG CGT GAT TGC TCT GAT GGC 
Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp Gly
1250 1255 1260 1265 

TCC GAT GAA CAG CAC TGC GAG CCC CTC TGT ACG CAC TTC ATG GAC TTT 
Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp Phe

1270 1275 1280

GTG TGT AAG AAC CGC CAG CAG TGC CTG TTC CAC TCC ATG GTC TGT GAC 3971 
Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp

1285 1290 1295 

GGA ATC ATC CAG TGC CGC GAC GGG TCC GAT GAG GAT GCG GCG TTT GCA 4019 
Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe Ala

1300 1305 1310 

GGA TGC TCC CAA GAT CCT GAG TTC CAC AAG GTA TGT GAT GAG TTC GGT 4067 
Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly

1315 1320 1325 

TTC CAG TGT CAG AAT GGA GTG TGC ATC AGT TTG ATT TGG AAG TGC GAC 4115 
Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340 1345 

GGG ATG GAT GAT TGC GGC GAT TAT TCT GAT GAA GCC AAC TGC GAA AAC 4163 
Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn 

1350 1355 1360 

CCC ACA GAA GCC CCA AAC TGC TCC CGC TAC TTC CAG TTT CGG TGT GAG 4211 
Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Glu

1365 1370 1375 

AAT GGC CAC TGC ATC CCC AAC AGA TGG AAA TGT GAC AGG GAG AAC GAC 4259 
Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp

1380 1385 1390 

TGT GGG GAC TGG TCT GAT GAG AAG GAT TGT GGA GAT TCA CAT ATT CTT 4307 
Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile Leu

1395 1400 1405 

CCC TTC TCG ACT CCT GGG CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC 4355 
Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg 
1410 1415 1420 1425

TGC AGC ACT GGG ACC TGC GTG ATG GAC ACC TGG GTG TGC GAC GGG TAC
Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly Tyr 

1430 1435 1440 

CGA GAT TGT GCA GAT GGC TCT GAC GAG GAA GCC TGC CCC TTG CTT GCA 
Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu Ala 

1445 1450 1455 

AAC GTC ACT GCT GCC TCC ACT CCC ACC CAA CTT GGG CGA TGT GAC CGA 
Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp Arg

1460 1465 1470 

TTT GAG TTC GAA TGC CAC CAA CCG AAG ACG TGT ATT CCC AAC TGG AAG 
Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp Lys

1475 1480 1485

CGC TGT GAC GGC CAC CAA GAT TGC CAG GAT GGC CGG GAC GAG GCC AAT 
Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala Asn

1490 1495 1500 1505 

TGC CCC ACA CAC AGC ACC TTG ACT TGC ATG AGC AGG GAG TTC CAG TGC 

Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln Cys

1510 1515 1520 

GAG GAC GGG GAG GCC TGC ATT GTG CTC TCG GAG CGC TGC GAC GGC TTC 
Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe

1525 1530 1535 

CTG GAC TGC TCG GAC GAG AGC GAT GAA AAG GCC TGC ACT GAT GAG TTG 
Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu 

1540 1545 1550 

ACT GTG TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG 
Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly

1555 1560 1565 

GAT GTG ACT TTG ACC TGG ATG AGG CCC AAA AAA ATG CCC TCT GCA TCT
Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ser

1570 1575 1580 1585 

TGT GTA TAT AAT GTC TAC TAC AGG GTG GTT GGA GAG AGC ATA TGG AAG 4883 

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys

1590 1595 1600 

ACT CTG GAG ACC CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTC TTG 4931 
Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu 

1605 1610 1615 

AAA CCA GAT ACC ACG TAT CAG GTT AAA GTA CAG GTT CAG TGT CTC AGC 4979 
Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser

1620 1625 1630 

AAG GCA CAC AAC ACC AAT GAC TTT GTG ACC CTG AGG ACC CCA GAG GGA 5027 
Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly 

1635 1640 1645 

TTG CCA GAT GCC CCT CGA AAT CTC CAG CTG TCA CTC CCC AGG GAA GCA 5075 
Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu Ala
1650 1655 1660 1665 

GAA GGT GTG ATT GTA GGC CAC TGG GCT CCT CCC ATC CAC ACC CAT GGC 5123 
Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His Gly

1670 1675 1680 

CTC ATC CGT GAG TAC ATT GTA GAA TAC AGC AGG AGT GGT TCC AAG ATG 5171 
Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Met

1685 1690 1695 

TGG GCC TCC CAG AGG GCT GCT AGT AAC TTT ACA GAA ATC AAG AAC TTA 5219 
Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu

1700 1705 1710 

TTG GTC AAC ACT CTA TAC ACC GTC AGA GTG GCT GCG GTG ACT AGT CGT 5267 
Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg

1715 1720 1725

GGA ATA GGA AAC TGG AGC GAT TCT AAA TCC ATT ACC ACC ATA AAA GGA 
Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys Gly
1730 1735 1740 1745

AAA GTG ATC CCA CCA CCA GAT ATC CAC ATT GAC AGC TAT GGT GAA AAT 
Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu Asn

1750 1755 1760 

TAT CTA AGC TTC ACC CTG ACC ATG GAG AGT GAT ATC AAG GTG AAT GGC 
Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn Gly

1765 1770 1775 

TAT GTG GTG AAC CTT TTC TGG GCA TTT GAC ACC CAC AAG CAA GAG AGG 
Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Arg

1780 1785 1790 

AGA ACT TTG AAC TTC CGA GGA AGC ATA TTG TCA CAC AAA GTT GGC AAT 
Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly Asn

1795 1800 1805 

CTG ACA GCT CAT ACA TCC TAT GAG ATT TCT GCC TGG GCC AAG ACT GAC 
Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp
1810 1815 1820 1825 

TTG GGG GAT AGC CCT CTG GCA TTT GAG CAT GTT ATG ACC AGA GGG GTT 
Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly Val 

1830 1835 1840 

CGC CCA CCT GCA CCT AGC CTC AAG GCC AAA GCC ATC AAC CAG ACT GCA 
Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr Ala

1845 1850 1855 

GTG GAA TGT ACC TGG ACC GGC CCC CGG AAT GTG GTT TAT GGT ATT TTC 
Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe
1860 1865 1870

TAT GCC ACG TCC TTT CTT GAC CTC TAT CGC AAC CCG AAG AGC TTG ACT 5747
Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu Thr 

1875 1880 1885 

ACT TCA CTC CAC AAC AAG ACG GTC ATT GTC ACT AAG GAT GAG CAG TAT 5795 
Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln Tyr
1890 1895 1900 1905 

TTG TTT CTG GTC CGT GTA GTG GTA CCC TAC CAG GGG CCA TCC TCT GAC 5843 
Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser Asp

1910 1915 1920 

TAC GTT GTA GTG AAG ATG ATC CCG GAC AGC AGG CTT CCA CCC CGT CAC 5891 
Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His

1925 1930 1935 

CTG CAT GTG GTT CAT ACG GGC AAA ACC TCC GTG GTC ATC AAG TGG GAA 5939 
Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp Glu

1940 1945 1950 

TCA CCG TAT GAC TCT CCT GAC CAG GAC TTG TTG TAT GCA ATT GCA GTC 5987 
Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala Val

1955 1960 1965 

AAA GAT CTC ATA AGA AAG ACT GAC AGG AGC TAC AAA GTA AAA TCC CGT 6035 
Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980 1985 

AAC AGC ACT GTG GAA TAC ACC CTT AAC AAG TTG GAG CCT GGC GGG AAA 6083 
Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly Lys 

1990 1995 2000 

TAC CAC ATC ATT GTC CAA CTG GGG AAC ATG AGC AAA GAT TCC AGC ATA 6131 
Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser Ile

2005 2010 2015 

AAA ATT ACC ACA GTT TCA TTA TCA GCA CCT GAT GCC TTA AAA ATC ATA 6179
Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile
2020 2025 2030

ACA GAA AAT GAT CAT GTT CTT CTG TTT TGG AAA AGC CTG GCT TTA AAG 6227 
Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys
2035 2040 2045

GAA AAG CAT TTT AAT GAA AGC AGG GGC TAT GAG ATA CAC ATG TTT GAT 6275 
Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060 2065 

AGT GCC ATG AAT ATC ACA GCT TAC CTT GGG AAT ACT ACT GAC AAT TTC 6323
Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe

2070 2075 2080 

TTT AAA ATT TCC AAC CTG AAG ATG GGT CAT AAT TAC ACG TTC ACC GTC 6371 
Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val

2085 2090 2095 

CAA GCA AGA TGC CTT TTT GGC AAC CAG ATC TGT GGG GAG CCT GCC ATC 6419 
Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala Ile

2100 2105 2110 

CTG CTG TAC GAT GAG CTG GGG TCT GGT GCA GAT GCA TCT GCA ACG CAG 6467
Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr Gln
2115 2120 2125 

GCT GCC AGA TCT ACG GAT GTT GCT GCT GTG GTG GTG CCC ATC TTA TTC 6515
Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140 2145 

CTG ATA CTG CTG AGC CTG GGG GTG GGG TTT GCC ATC CTG TAC ACG AAG 6563
Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys

2150 2155 2160 

CAC CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC 6611 
His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr

2165 2170 2175

AGC TCC AGG CTG GX TCC GCA ATC TTC TCC TCT GGG GAT GAC CTG GGG 6659 
Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly

2180 2185 2190 

GAA GAT GAT GAA GAT GCC CCT ATG ATA ACT GGA TTT TCA GAT GAC GTC 6707 
Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val

2195 2200 2205 

CCC ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6762 
Pro Met Val Ile Ala

2210 

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6822 
GTTATTTTTA TATGGGCCAA A 6843

(1) GENERAL INFORMATION:



(i) APPLICANT: 

(A) NAME: KOWA CO., LTD 

(B) STREET: Nishiki 3-chome, Naka-ku

(C) CITY: Nagoya-shi

(D) STATE: Aichi

(E) COUNTRY: Japan 

(F) POSTAL CODE (ZIP): none

(ii) TITLE OF INVENTION: LDL receptor analog protein ...

(iii) NUMBER OF SEQUENCES: 7 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6639 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTCAC CCTGGTCGCG 60 

CTGCTGCCGC CCGGGGCTCT CTGCGAGGTG TGGACGCGGA CACTGCACGG CGGCCGCGCG 120

CCCTTACCCC AGGAGCGGGG CTTCCGCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180 

TGGGAGCGCG GGGATGCCAG GGGGGCGAGC CGGGCGGACG AGAAGCCGCT CCGGAGGAGA 240

CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTCAG CCTCAATGAT 300

TCCCACAATC AGATGGTGGT GCACTGGGCC GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360 

GCCCGGGACA GCCTGGCGTT GGCCAGGCCC AGGAGCAGTG ATGTGTACGT GTCTTATGAC 420 

TATGGAAAAT CATTCAATAA GATTTCAGAG AAATTGAACT TCGGCGCGGG AAATAACACA 480

GAGGCTGTGG TGGCCCAGTT CTACCACAGC CCTGCGGACA ACAAACGGTA CATCTTCGCA 540

GATGCCTACG CCCAGTATCT CTGGATCACG TTTGACTTCT GCAACACCAT CCATGGCTTT 600 

TCCATCCCGT TCCGGGCAGC TGATCTCCTA CTCCACAGTA AGGCCTCCAA CCTTCTCCTG 660 

GGCTTCGACA GGTCTCACCC CAACAAGCAG CTGTGGAAGT CGGATGATTT TGGCCAGACC 720 

TGGATCATGA TTCAAGAACA CGTGAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780

CCAAACACCA TCTACATCGA ACGGCACGAA CCTTCTGGCT ACTCCACGGT TTTCCGAAGT 840

ACAGACTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCT TGGAGGAAGT GAGAGACTTT 900

CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTCCACTG 960

CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGCGGGC CGCCCAGTTT 1020

GTTACAAGAC ATCCTATCAA CGAATATTAC ATCGCGGATG CCTCGGAGGA CCAGGTGTTT 1080

GTGTGTGTCA GTCACAGCAA CAACCGCACC AACCTCTACA TCTCGGAGGC AGAGGGCTTG 1140

AAGTTCTCTC TGTCCCTGGA GAACGTGCTC TACTACACCC CGGGAGGGGC CGGCAGTGAC 1200

ACCTTGGTGA GGTACTTTGC AAATGAACCG TTTGCTGACT TCCATCGTGT GGAAGGGTTG 1260

CAGGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCT 1320

GTCATCACCT TTGACAAAGG GGGCACCTGG GAATTTCTGC AGGCTCCAGC CTTCACGGGG 1380

TATGGAGAGA AAATCAACTG TGAGCTGTCC GAGGGCTGTT CCCTCCACCT GGCCCAGCGC 1440

CTCAGCCAGC TGCTCAACCT CCAGCTCCGG AGGATGCCCA TCCTGTCCAA GGAGTCGGCG 1500

CCTGGCCTCA TCATTGCCAC GGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560

TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAT 1620

ACATGGGGAG ACCATGGCGG CATCATCATG GCCATTGCCC AAGGCATGGA AACGAACGAA 1680

CTGAAGTACA GTACCAACGA AGGGGAGACC TGGAAAGCCT TCACCTTCTC TGAGAAGCCC 1740

GTGTTTGTGT ATGGGCTCCT CACGGAACCC GGCGAGAAGA GCACGGTCTT CACCATCTTT 1800

GGCTCCAACA AGGAGAACGT GCACAGCTGG CTCATCCTCC AGGTCAATGC CACAGACGCC 1860

CTGGGGGTTC CTTGCACAGA GAACGACTAC AAGCTCTGGT CACCATCTGA TGAGCGGGGG 1920

AATGAGTGTT TGCTTGGACA CAAGACTGTT TTCAAACGGA GGACCCCGCA CGCCACATGC 1980

TTTAACGGAG AAGACTTTGA CAGGCCGGTG GTTGTGTCCA ACTGCTCCTG CACCCGGGAG 2040

GACTATGAGT GTGACTTTGG CTTCCGGATG AGTGAAGACT TGGCATTAGA GGTGTGTGTT 2100

CCAGATCCAG GATTTTCTGG AAAGTCCTCC CCTCCAGTGC CTTGTCCCGT GGGCTCTACG 2160

TACAGGCGAT CAAGAGGCTA CCGGAAGATT TCTGGGGACA CCTGTAGTGG AGGAGATGTT 2220

GAGGCACGGC TAGAAGGAGA GCTGGTCCCC TGTCCCCTGG CAGAAGAGAA CGAGTTCATC 2280

CTGTACGCCA CGCGCAAGTC CATCCACCGC TATGACCTGG CTTCCGGAAC CACGGAGCAG 2340

TTGCCCCTCA CTGGGTTGCG GGCAGCAGTG GCCCTGGACT TTGACTATGA GCACAACTGC 2400

CTGTATTGGT CTGACCTGGC CTTGGACGTC ATCCAGCGCC TCTGTTTGAA CGGGAGTACA 2460

GGACAAGAGG TGATCATCAA CTCTGACCTG GAGACGGTAG AAGCTTTGGC TTTTGAACCC 2520

CTCAGCCAAT TACTTTACTG GGTGGACGCA GGCTTTAAAA AGATCGAGGT AGCCAATCCA 2580

GATGGTGACT TCCGACTCAC CGTCGTCAAT TCCTCGGTGC TGGATCGGCC CCGGGCCCTG 2640

GTCCTTGTGC CCCAAGAAGG GATCATGTTC TGGACCGACT GGGGAGACCT GAAGCCTGGG 2700

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ATTTATCGGA GCAACATGGA CGGATCTGCC GCCTATCCCC TCGTvJ TCGGA GGATGTGAAG 2760 

TGGCCCAATG GCATTTCCGT GGACGATCAG TGGATCTACT GGACGGATGC CTACCTGGAC 2820 

TGCATTGAGC GCATCACGTT CAGCGGCCAG CAGCGCTCCG TCATCCTGGA CAGACTCCCG 2880

CACCCCTATG CCATTGCTGT CTTTAAGAAT GAGATTTACT GGGATGACTG GTCACAGCTC 2940

AGCATATTCC GAGCTTCTAA GTACAGCGGG TCCCAGATGG AGATTCTGGC CAGCCAGCTC 3000

ACGGGGCTGA TGGACATGAA GATCTTCTAC AAGGGGAAGA ACACAGGAAG CAATGCGTGT 3060 

GTACCCAGGC CGTGCAGCCT GCTGTGCCTG CCCAGAGCCA ACAACAGCAA AAGCTGCAGG 3120 

TGTCCAGATG GCGTGGCCAG CAGTGTCCTC CCTTCCGGGG ACCTGATGTG TGACTGCCCT 3180 

AAGGGCTACG AGCTGAAGAA CAACACGTGT GTCAAAGAAG AAGACACCTG TCTGCGCAAC 3240

CAGTACCGCT GCAGCAACGG GAACTGCATC AACAGCATCT GGTGGTGCGA TTTCGACAAC 3300

GACTGCGGAG ACATGAGCGA CGAGAAGAAC TGCCCTACCA CCATCTGCGA CCTGGACACC 3360

CAGTTCCGTT GCCAGGAGTC TGGGACGTGC ATCCCGCTCT CCTACAAATG TGACCTCGAG 3420

GATGACTGTG GGGACAACAG TGACGAAAGG CACTGTGAAA TGCACCAGTG CCGGAGCGAC 3480 

GAATACAACT GCAGCTCGGG CATGTGCATC CGCTCCTCCT GGGTGTGCGA CGGGGACAAC 3540

GACTGCAGGG ACTGGTCCGA CGAGGCCAAC TGCACAGCCA TCTATCACAC CTGTGAGGCC 3600

TCCAACTTCC AGTGCCGCAA CGGGCACTGC ATCCCCCAGC GGTGGGCGTG TGACGGCGAC 3660

GCCGACTGCC AGGATGGCTC TGATGAGGAT CCAGCCAACT GTGAGAAGAA GTGCAACGGC 3720 

TTCCGCTGCC CGAACGGCAC CTGCATTCCC TCCACCAAGC ACTGTGACGG CCTGCACGAT 3780 

TGCTCGGACG GCTCCGACGA GCAGCACTGC GAGCCCCTGT GTACACGGTT CATGGACTTC 3840 

GTGTGTAAGA ACCGCCAGCA GTGCCTCTTC CACTCCATGG TGTGCGATGG GATCATCCAG 3900 

TGCCGTGACG GCTCCGACGA GGACCCAGCC TTTGCAGGAT GCTCCCGAGA CCCCGAGTTC 3960

CACAAGGTGT GCGATGAGTT CGGCTTCCAG TGTCAGAACG GCGTGTGCAT CAGCTTGATC 4020 

TGGAAGTGCG ACGGGATGGA TGACTGCGGG GACTACTCCG ACGAGGCCAA CTGTGAAAAC 4080

CCCACAGAAG CCCCCAACTG CTCCCGCTAC TTCCAGTTCC GGTGTGACAA TGGCCACTGC 4140 

ATCCCCAACA GGTGGAAGTG TGACAGGGAG AATGACTGTG GGGACTGGTC CGACGAGAAG 4200 

GACTGTGGAG ATTCACATGT ACTTCCGTCT ACGACTCCTG CACCCTCCAC GTGTCTGCCC 4260 

AATTACTACC GCTGCGGCGG GGGGGCCTGC GTGATAGACA CGTGGGTTTG TGACGGGTAC 4320 

CGAGATTGCG CAGATGGATC CGACGAGGAA GCCTGCCCCT CGCTCCCCAA TGTCACTGCC 4380 

ACCTCCTCCC CCTCCCAGCC TGGACGATGC GACCGATTTG AGTTTGAGTG CCACCAGCCA 4440 

AAGAAGTGCA TCCCTAACTG GAGACGCTGT GACGGCCATC AGGATTGCCA GGATGGCCAG 4500 

GACGAGGCCA ACTGCCCCAC TCACAGCACC TTGACCTGCA TGAGCTGGGA GTTCAAGTGT 4560 

GAGGATGGCG AGGCCTGCAT CGTGCTGTCA GAACGCTGCG ACGGCTTCCT GGACTGCTCA 4620

GATGAGAGCG ACGAGAAGGC CTGCAGTGAT GAGTTAACTG TATACAAAGT ACAGAATCTT 4680

CAGTGGACAG CTGACTTCTC TGGGAATGTC ACTTTGACCT GGATGCGGCC CAAAAAAATG 4740

CCCTCTGCTG CTTGTGTATA CAACGTGTAC TATAGAGTTG TTGGAGAGAG CATATGGAAG 4800

ACTCTGGAGA CTCACAGCAA TAAGACAAAC ACTGTATTAA AAGTGTTGAA ACCAGATACC 4860

ACCTACCAGG TTAAAGTGCA GGTTCAGTGC CTGAGCAAGG TGCACAACAC CAATGACTTT 4920

GTGACCTTGA GAACTCCAGA GGGATTGCCA GACGCCCCTC AGAACCTCCA GCTGTCGCTC 4980

CACGGGGAAG AGGAAGGTGT GATTGTGGGC CACTGGAGCC CTCCCACCCA CACCCACGGC 5040

CTCATTCGCG AATACATTGT AGAGTATAGC AGGAGTGGTT CCAAGGTGTG GACTTCAGAA 5100

AGGGCTGCTA GTAACTTTAC AGAAATAAAG AACTTGTTGG TCAACACCCT GTACACCGTC 5160

AGAGTGGCTG CGGTGACGAG TCGTGGGATA GGAAACTGGA GCGATTCCAA ATCCATTACC 5220

ACCGTGAAAG GAAAAGCGAT CCCGCCACCA AATATCCACA TTGACAACTA CGATGAAAAT 5280

TCCCTGAGTT TTACCCTGAC CGTGGATGGG AACATCAAGG TGAATGGCTA TGTGGTGAAC 5340

PTTTTCTGGG CATTTGACAC CCACAAACAA GAGAAGAAAA CCATGAACTT CCAAGGGAGC 5400

TCAGTGTCCC ACAAAGTTGG CAATCTGACA GCACAGACGG CCTATGAGAT TTCCGCCTGG 5460

GCCAAGACTG ACTTGGGCGA TAGTCCTCTG TCATTTGAGC ATGTCACGAC CAGAGGGGTT 5520

CGCCCACCTG CTCCTAGCCT CAAGGCCAGG GCTATCAATC AGACTGCAGT GGAATGCACC 5580

TGGACAGGCC CCAGGAATGT GGTGTATGGC ATTTTCTATG CCACATCCTT CCTGGACCTC 5640

TACCGCAACC CAAGCAGCCT GACCACGCCG CTGCACAACG CAACCGTGCT CGTCGGTAAG 5700

GATGAGCAGT ATCTGTTTCT GGTCCGGGTG GTGATGCCCT ACCAAGGGCC GTCCTCGGAC 5760

TACGTGGTCG TGAAGATGAT CCCGGACAGC AGGCTTCCTC CCCGGCACCT GCATGCCGTT 5820

CACACCGGCA AGACCTCGGC CGTCATCAAG TGGGAGTCGC CCTACGACTC TCCTGACCAG 5880

GACCTGTTCT ATGCGATCGC AGTTAAAGAT CTGATACGAA AGACGGACCG GAGCTACAAA 5940

GTCAAGTCCC GCAACAGCAC CGTGGAGTAC ACCCTGAGCA AGCTGGAGCC CGGAGGGAAA 6000

TACCACGTCA TTGTGCAGCT GGGGAACATG AGCAAAGATG CCAGTGTGAA GATCACCACC 6060

GTTTCGTTAT CGGCACCCGA TGCCTTAAAA ATCATAACAG AAAATGACCA CGTCCTTCTC 6120

TTCTGGAAAA GTCTAGCTCT AAAGGAAAAG TATTTTAACG AAAGCAGGGG CTACGAGATA 6180

CACATGTTTG ATAGCGCCAT GAATATCACC GCATACCTTG GGAATACTAC TGACAATTTC 6240

TTTAAAATTT CCAACCTGAA GATGGGTCAC AATTACACAT TCACGGTCCA GGCACGATGC 6300

CTTTTGGGCA GCCAGATCTG CGGGGAGCCT GCCGTGCTAC TGTATGATGA GCTGGGGTCT 6360

GGTGGCGATG CGTCGGCGAT GCAGGCTGCC AGGTCTACTG ATGTCGCCGC CGTGGTGGTG 6420

CCCATCCTGT TTCTGATACT GCTGAGCCTG GGGGTCGGGT TTGCCATCCT GTACACGAAG 6480

CATCGGAGGC TGCAGAGCAG CTTCACCGCC TTCGCCAACA GCCACTACAG CTCCAGACTC 6540

GGCTCCGCCA TCTTCTCCTC TGGGGATGAC TTGGGGGAGG ATGATGAAGA TGCTCCTATG 6600

ATCACTGGAT TTTCGGACGA CGTCCCCATG GTGATAGCC 6639

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE : protein 
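
The stated lengths of SEQ ID NO: 1 (6639 base pairs) and SEQ ID NO: 2 (2213 amino acids) can be cross-checked mechanically. The following Python sketch is illustrative only and is not part of the filed listing. It assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER`, `translate` and `FIRST_60` are hypothetical helpers introduced here. It translates the first 60 bases of SEQ ID NO: 1 and confirms that 6639 is exactly three times 2213.

```python
from itertools import product

# Standard genetic code (assumed), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, matching the notation of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO: 1 as printed in the listing (bases 1 to 60).
FIRST_60 = "ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCCT TCCTATTCAC CCTGGTCGCG"

protein = translate(FIRST_60)
print(" ".join(THREE_LETTER[a] for a in protein))

# Length check: 6639 bases should encode 2213 codons with no remainder.
print(6639 // 3, 6639 % 3)
```

Under these assumptions the first print reproduces the opening residues of SEQ ID NO: 2 (Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr Leu Val Ala), and the second prints `2213 0`.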




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe 
1 5 10 15 

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr

20 25 30 

Arg Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe
35 40 45 

Arg Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly

50 55 60 

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg 
65 70 75 80 



Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val
85 90 95 

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu
100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125 

Arg Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser 
130 135 140 

Phe Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr
145 150 155 160 

Glu Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg

165 170 175 

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp
180 185 190 

Phe Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp
195 200 205 

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg 
210 215 220 

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp
245 250 255

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser
260 265 270 

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu
275 280 285 

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp
290 295 300 

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Pro Leu 
305 310 315 320 

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg
325 330 335 

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala
340 345 350 

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn
355 360 365 

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu
370 375 380 

Ser Leu Glu Asn Val Leu Tyr Tyr Thr Pro Gly Gly Ala Gly Ser Asp 
385 390 395 400 

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg 
405 410 415 

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser
420 425 430 

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly
435 440 445 

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys
450 455 460 

Ile Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg
465 470 475 480 

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser
485 490 495 

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys
500 505 510 

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala
515 520 525 

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp 
530 535 540 

His Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560 

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe 
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu
580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His
595 600 605 

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro
610 615 620 

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly 
625 630 635 640 

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro 
645 650 655 

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val 
660 665 670 

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe 
675 680 685 

Arg Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly 
690 695 700 

Phe Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr 
705 710 715 720 

Tyr Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser
725 730 735 

Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro 
740 745 750 

Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile
755 760 765 

His Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr
770 775 780 

Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys 
785 790 795 800 

Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu
805 810 815 

Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr
820 825 830 

Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val
835 840 845

Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe
850 855 860 

Arg Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu 
865 870 875 880 

Val Leu Val Pro Gin Glu Gly He Met Phe Trp Thr Asp Trp Gly Asp 
885 890 895 

Leu Lys Pro Gly He Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr 
900 905 910 

Arg Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly He Ser Val Asp 
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915 920 



925 



Asp Gin Trp He Tyr Trp Thr Asp Ala Tyr Leu Asp Cys He Glu Arg 
930 935 940 

He Thr Phe Ser Gly Gin Gin Arg Ser Val He Leu Asp Arg Leu Pro 
945 950 955 * ^ 960 

His Pro Tyr Ala He Ala Val Phe Lys Asn Glu He Tyr Trp Asp Asp 
965 970 975 

Trp Ser Gin Leu Ser He Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gin 
980 985 990 

Met Glu He Leu Ala Ser Gin Leu Thr Gly Leu Met Asp Met Lys He 
995 1000 1005 

Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg Pro 
1010 1015 1020 

Cys Ser Leu Leu Cys Leu Pro Arg Ala Asn Asn Ser Lys Ser Cys Arg 
1025 1030 1035 1040 

Cys Pro Asp Gly Val Ala Ser Ser Val Leu Pro Ser Gly Asp Leu Met 
1045 1050 1055 

Cys Asp Cys Pro Lys Gly Tyr Glu Leu Lys Asn Asn Thr Cys Val Lys 
1060 1065 1070 

Glu Glu Asp Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly Asn
1075 1080 1085

Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly Asp
1090 1095 1100

Met Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr
1105 1110 1115 1120

Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys
1125 1130 1135

Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys
1140 1145 1150

Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly Met
1155 1160 1165

Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg Asp
1170 1175 1180

Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu Ala
1185 1190 1195 1200

Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp Ala
1205 1210 1215

Cys Asp Gly Asp Ala Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro Ala
1220 1225 1230

Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys
1235 1240 1245

Ile Pro Ser Thr Lys His Cys Asp Gly Leu His Asp Cys Ser Asp Gly
1250 1255 1260

Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe
1265 1270 1275 1280

Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp
1285 1290 1295

Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala
1300 1305 1310

Gly Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly
1315 1320 1325

Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340

Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn
1345 1350 1355 1360

Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp
1365 1370 1375 



Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp
1380 1385 1390 

Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu 
1395 1400 1405 

Pro Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg 
1410 1415 1420 

Cys Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr
1425 1430 1435 1440 

Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro 
1445 1450 1455 

Asn Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg
1460 1465 1470

Phe Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg
1475 1480 1485

Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn
1490 1495 1500 

Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys 
1505 1510 1515 1520 

Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe
1525 1530 1535 

Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu 
1540 1545 1550 

Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly
1555 1560 1565 

Asn Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala 
1570 1575 1580 

Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys
1585 1590 1595 1600

Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu
1605 1610 1615

Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser
1620 1625 1630

Lys Val His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu Gly
1635 1640 1645

Leu Pro Asp Ala Pro Gln Asn Leu Gln Leu Ser Leu His Gly Glu Glu
1650 1655 1660

Glu Gly Val Ile Val Gly His Trp Ser Pro Pro Thr His Thr His Gly
1665 1670 1675 1680

Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Val
1685 1690 1695

Trp Thr Ser Glu Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu
1700 1705 1710 



Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg 
1715 1720 1725 

Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Val Lys Gly
1730 1735 1740

Lys Ala Ile Pro Pro Pro Asn Ile His Ile Asp Asn Tyr Asp Glu Asn
1745 1750 1755 1760

Ser Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly
1765 1770 1775

Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys
1780 1785 1790

Lys Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn
1795 1800 1805

Leu Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp
1810 1815 1820 

Leu Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val 
1825 1830 1835 1840 

Arg Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala
1845 1850 1855

Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe
1860 1865 1870 

Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Ser Ser Leu Thr 
1875 1880 1885 

Thr Pro Leu His Asn Ala Thr Val Leu Val Gly Lys Asp Glu Gln Tyr
1890 1895 1900

Leu Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp
1905 1910 1915 1920

Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His
1925 1930 1935

Leu His Ala Val His Thr Gly Lys Thr Ser Ala Val Ile Lys Trp Glu
1940 1945 1950

Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Phe Tyr Ala Ile Ala Val
1955 1960 1965

Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980

Asn Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys
1985 1990 1995 2000

Tyr His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val
2005 2010 2015

Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile
2020 2025 2030

Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys
2035 2040 2045

Glu Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060

Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe
2065 2070 2075 2080

Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val
2085 2090 2095

Gln Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val
2100 2105 2110

Leu Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln
2115 2120 2125

Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140

Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys
2145 2150 2155 2160

His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr
2165 2170 2175

Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly
2180 2185 2190

Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val
2195 2200 2205

Pro Met Val Ile Ala
2210



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6961 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 178..261

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 262..6816
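
As a cross-check of the feature coordinates given for SEQ ID NO: 3, the following minimal Python sketch computes the length of each feature from its 1-based, inclusive LOCATION range and the number of codons it spans. The helper name feature_length is hypothetical and is not part of the listing; only the coordinates are taken from the FEATURE entries.

```python
def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of a feature with 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates taken from the FEATURE entries of SEQ ID NO: 3.
sig_nt = feature_length(178, 261)    # signal peptide
mat_nt = feature_length(262, 6816)   # mature peptide

# Each residue is encoded by one three-nucleotide codon.
sig_codons = sig_nt // 3
mat_codons = mat_nt // 3
total_codons = sig_codons + mat_codons

print(sig_nt, sig_codons)
print(mat_nt, mat_codons)
print(total_codons)
```

Under these coordinates both ranges are whole multiples of three, and the two features together account for every codon between the initiating ATG at position 178 and the last coding nucleotide at position 6816.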

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CCGCGAGCCG CACACGTGAC GGCGCCGCGC CGCGCCGCGC CGCGCCGAGC GGGACCCAGC 60
GGCTGCCCGG AGCCCCGGGA GCGGCGCGCG CGCGGCCCCG GCCCCGCCGC TCGGCCGGCG 120 
GCGCGCTGCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GCCGCGTTCG CCCGAACATG 180 

Met 
1 

GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC CCC TTC CTA TTC ACC 228 
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr 

5 10 15 

CTG GTC GCG CTG CTG CCG CCC GGG GCT CTC TGC GAG GTG TGG ACG CGG 276 
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Arg 

20 25 30 

ACA CTG CAC GGC GGC CGC GCG CCC TTA CCC CAG GAG CGG GGC TTC CGC 324
Thr Leu His Gly Gly Arg Ala Pro Leu Pro Gln Glu Arg Gly Phe Arg
35 40 45

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG CTG TGG GAG CGC GGG GAT 372
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Glu Arg Gly Asp
50 55 60 65

GCC AGG GGG GCG AGC CGG GCG GAC GAG AAG CCG CTC CGG AGG AGA CGG 420
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Arg Arg
70 75 80

AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTC AGC 468
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser
85 90 95

CTC AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCC GGA GAG AAA 516
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys
100 105 110

AGC AAC GTG ATC GTG GCC TTG GCC CGG GAC AGC CTG GCG TTG GCC AGG 564
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg
115 120 125

CCC AGG AGC AGT GAT GTG TAC GTG TCT TAT GAC TAT GGA AAA TCA TTC 612
Pro Arg Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe
130 135 140 145

AAT AAG ATT TCA GAG AAA TTG AAC TTC GGC GCG GGA AAT AAC ACA GAG 660
Asn Lys Ile Ser Glu Lys Leu Asn Phe Gly Ala Gly Asn Asn Thr Glu
150 155 160

GCT GTG GTG GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAA CGG TAC 708
Ala Val Val Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr
165 170 175

ATC TTC GCA GAT GCC TAC GCC CAG TAT CTC TGG ATC ACG TTT GAC TTC 756
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe
180 185 190

TGC AAC ACC ATC CAT GGC TTT TCC ATC CCG TTC CGG GCA GCT GAT CTC 804
Cys Asn Thr Ile His Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu
195 200 205

CTA CTC CAC AGT AAG GCC TCC AAC CTT CTC CTG GGC TTC GAC AGG TCT 852
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser
210 215 220 225

CAC CCC AAC AAG CAG CTG TGG AAG TCG GAT GAT TTT GGC CAG ACC TGG 900
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp
230 235 240

ATC ATG ATT CAA GAA CAC GTG AAG TCC TTT TCT TGG GGA ATT GAT CCC 948
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro
245 250 255 

TAT GAC AAA CCA AAC ACC ATC TAC ATC GAA ~GG CAP r- a -r^-n 
Tyr Asp Lys Pro Asn Thr Ile £ SJ j£ £S G?u £ S £y 

265 270 



S E S fS S 2S S S S E £ 2S E « «K 2J 



tfl TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG arr ar-r ^ ™~ 305 

Ty, „« Phe Ala Thr ly. v.l v.l SI S£ S S 

E E S£ S S SSEE-S £ g K » S 

S §K £ vIT S S K S S 2£ £S £ £ £ S 2S 123S 

345 350 



GCC TCG GAG GAC CAG GTG TTT GTG TGT GTC ACT CAC AGC AAC AAC CGC 
Ala Ser Glu Asp Gin Val Phe Val Cys Val Ser His Ser £n A^n Sg 

J:):> 360 365 3 



Thr A.,n Leu Tyr lie Ser Glu Ala STu l£ ^ ^ S J£ 

CTG GAG AAC GTG CTC TAC TAC ACC CCG GGA GGG GCC GGC ACT GAC Acr 
Leu Glu Asn val Leu Tyr Tyr Thr Pro Gly Gly Ala sS Sp Thr 

TTG GTG AGG TAC TTT GCA AAT GAA CCG TTT GCT GAC TTC CAT CGT CTC 
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp p£e Ss Sg S? 

5 410 41 c 

GAA CCG TTG CAG GGA GTC TAC ATT GCT ACT CTG ATT AAT GOT TCT ATC 
Glu Gly Leu Gin Gly Val Tyr lie Ala Thr Leu lie HI ^ Met 
'•« w 425 43 0 



GGC GGC ATC ATC ATG GCC ATT GCC CAA GGC ATG GAA ACC AAC GAA CTG
Gly Gly Ile Ile Met Ala Ile Ala Gln Gly Met Glu Thr Asn Glu Leu
550 555 560

AAG TAC ACT ACC AAC GAA GGG GAG ACC TGG AAA GCC TTC ACC TTC TCT
Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Ala Phe Thr Phe Ser
565 570 575

GAG AAG CCC GTG TTT GTG TAT GGG CTC CTC ACG GAA CCC GGC GAG AAG
Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu Lys
580 585 590

AGC ACG GTC TTC ACC ATC TTT GGC TCC AAC AAG GAG AAC GTG CAC AGC 



996 
1044 
1092 
1140 
1188 



1284 



ACC AAC CTC TAC ATC TCG GAG GCA GAG GGC TTG AAG TTC TCT CTG TCC mo 
20 Thr Asn Leu Tvr Tie <W «i„ ai, t „. VTZ L ™ CTG TCC 133 2 



1380 
1428 
1476 



AAT GAG GAG AAC ATG AGA TCT GTC ATC ACC TTT GAC AAA GGG GCC arr 
Asn Glu Glu Asn Met Arg Ser Val lie Thr p2 Asp JJJ Thr 1524 



1572 
1620 



TGG GAA TTT CTG CAG GCT CCA GCC TTC ACG GGG TAT GGA GAG AAA ATC
Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys Ile

AAC TGT GAG CTG TCC GAG GGC TGT TCC CTC CAC CTG GCC CAG CGC CTC
Asn Cys Glu Leu Ser Glu Gly Cys Ser Leu His Leu Ala Gln Arg Leu

AGC CAG CTG CTC AAC CTC CAG CTC CGG i£§ ATG CCC ATC CTG TCC AAG 1668
Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser Lys

£2 I£ OT . C A 7 T ^ «« TCA GTG cJa AAG AAC 



1716 



Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys Asn

o GC f* 5 A P GTG TAC ATC TCT A °C ACT GCT GGA GCC AGG 1764 

Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala Arg

520 525 
TGG CGA GAG GCA CTT CCT GGA CCT CAC TAC TAT ACA TGG GGA GAC CAT 1812
Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp His
530 535 540 545



1860 
1908 
1956 
2004 
Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His Ser

595 600 605 

TGG CTC ATC CTC CAG GTC AAT GCC ACA GAC GCC CTG GGG GTT CCT TGC 2052
Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro Cys
610 615 620 625

ACA GAG AAC GAC TAC AAG CTC TGG TCA CCA TCT GAT GAG CGG GGG AAT 2100
Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly Asn
630 635 640

GAG TGT TTG CTT GGA CAC AAG ACT GTT TTC AAA CGG AGG ACC CCG CAC 2148
Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His
645 650 655

GCC ACA TGC TTT AAC GGA GAA GAC TTT GAC AGG CCG GTG GTT GTG TCC 2196
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser
660 665 670

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTT GGC TTC CGG 2244
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Arg
675 680 685

ATG AGT GAA GAC TTG GCA TTA GAG GTG TGT GTT CCA GAT CCA GGA TTT 2292
Met Ser Glu Asp Leu Ala Leu Glu Val Cys Val Pro Asp Pro Gly Phe
690 695 700 705

TCT GGA AAG TCC TCC CCT CCA GTG CCT TGT CCC GTG GGC TCT ACG TAC 2340
Ser Gly Lys Ser Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr Tyr
710 715 720

AGG CGA TCA AGA GGC TAC CGG AAG ATT TCT GGG GAC ACC TGT AGT GGA 2388
Arg Arg Ser Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser Gly
725 730 735

GGA GAT GTT GAG GCA CGG CTA GAA GGA GAG CTG GTC CCC TGT CCC CTG 2436
Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro Leu
740 745 750

GCA GAA GAG AAC GAG TTC ATC CTG TAC GCC ACG CGC AAG TCC ATC CAC 2484
Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Thr Arg Lys Ser Ile His
755 760 765

CGC TAT GAC CTG GCT TCC GGA ACC ACG GAG CAG TTG CCC CTC ACT GGG 2532
Arg Tyr Asp Leu Ala Ser Gly Thr Thr Glu Gln Leu Pro Leu Thr Gly
770 775 780 785

TTG CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGC CTG 2580
Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys Leu
790 795 800

TAT TGG TCT GAC CTG GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG AAC 2628
Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu Asn
805 810 815

GGG AGT ACA GGA CAA GAG GTG ATC ATC AAC TCT GAC CTG GAG ACG GTA 2676
Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Asp Leu Glu Thr Val
820 825 830

GAA GCT TTG GCT TTT GAA CCC CTC AGC CAA TTA CTT TAC TGG GTG GAC 2724
Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val Asp
835 840 845

GCA GGC TTT AAA AAG ATC GAG GTA GCC AAT CCA GAT GGT GAC TTC CGA 2772
Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe Arg
850 855 860 865

CTC ACC GTC GTC AAT TCC TCG GTG CTG GAT CGG CCC CGG GCC CTG GTC 2820
Leu Thr Val Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu Val
870 875 880

CTT GTG CCC CAA GAA GGG ATC ATG TTC TGG ACC GAC TGG GGA GAC CTG 2868
Leu Val Pro Gln Glu Gly Ile Met Phe Trp Thr Asp Trp Gly Asp Leu
885 890 895

AAG CCT GGG ATT TAT CGG AGC AAC ATG GAC GGA TCT GCC GCC TAT CGC 2916
Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr Arg
900 905 910

CTC GTG TCG GAG GAT GTG AAG TGG CCC AAT GGC ATT TCC GTG GAC GAT 2964
Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp Asp
915 920 925

CAG TGG ATC TAC TGG ACG GAT GCC TAC CTG GAC TGC ATT GAG CGC ATC 3012
Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Asp Cys Ile Glu Arg Ile
930 935 940

% E K Sg fZ SS S2E ?S S 2£ £ E 25 SS - 
E 3! S E S S S S E £ £ E & e - K »» 

70 985 QQrt 

GAG ATT CTG GCC AGC CAG CTC ACG GGG CTG ATG GAG ATG AAG ATC TTC 
Glu Ile Leu Ala Ser Gln Leu Thr Gly Leu Met Asp Met Lys Ile Phe

K S K £ if sFe S K SI SS £ 5£ E 
£ E S E 2£ S S E 2S £2" E E S IF MM 

CCA GAT GGC GTG GCC AGC AGT GTC CTC CCT^CC GGG GAC CTG ATC°tgt 
Pro Asp Gly Va^Ala Ser Ser Val Le^Pro Ser Gly Asp Su S2 gs 

20 ^ £? C ^ ^ G G ? C TAC GA ° CTG A^AAC AAC ACG TGT GTC^AAA GAA 

P ^ iOfio LyS Gly ^ G1U Leu Lys Asn Asn Thr Cys Val JJi c^ 

1065 1070 

rt^ A u C I GT CTG CGC ^ C ^ TAC CGC TGC AGC AAC GGG AAC TGC 

Glu Asp^Thr Cys Leu Arg Asn^Gln Tyr Arg Cy S Ser Asn G^ itn gs 

25 t£ ^ tT° 2°° TGG TGC GAT ^ ^ ^ GA? 5 TGC GGA GAC ATG 34 92 

lie Asn Ser He Trp Trp Cys Asp Phe Asp Asn Asp Cys GlJ Asp J3 3492 

1100 _ _ ^ _ 

AGC GAC GAG AAG AAC TGC CCT ACC ACC ATC TGC GAC CTG GAC ACC CAG 3540
Ser Asp Glu Lys Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp Thr Gln



3343 



3395 



3444 



3588 
3636 



1110 1115 i t on 

TTC CGT TGC CAG GAG TCT GGG ACG TGC ATC CCG CTC TCC TAC AAA TGT 
Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr Lys Cys

1125 1130 1135

GAC CTC GAG GAT GAC TGT GGG GAC AAC AGT GAC GAA AGG CAC TGT GAA
Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Arg His Cys Glu

mI G ^ G I GC CGG AGC GAC Ga1 5 TAC AAC TGC AGC TC^GGC ATG TGC 

Met His Gin Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser G^ Set <gs 

1160 1165 

S 25 E E E S E SI S =S E 2 P E £ 2 P E 3732 
E 25 SS S E E £ E S E EE SI S E " s ° 



1190 1195 i onn 

A^n Se m n ^ 2°° ^ GGG °S C TGC ATC CCC CAG CGG TGG GCG TGT 
Asn Phe Gin Cys^Arg Asn Gly His Cys^Ile Pro Gin Arg Trp Ala Cys 

GAT ( 
Asp 1 

Sa ^ ™S ^ C 5? C ^" CGC TGC COT AAC S£° A CC TGC ATT 



3828 



I? 05 1210 1215 

K5C TCT GAT GAG GAT 

c — wj-i* nau » /n _ _ - 

1220 12 2S 



°? C GA ° GCC ^ TGC ^ Gft T GGC TCT GAT GAG GAT CCA GCC AAC 387S 
Asp Gly Asp Ala Asp Cys Gin Asp Gly Ser Asp Glu Asp Pro JS A^n 



rs,« f* n T , ^ ^-wt iva^ CL-Lj AAL CjGC ACC TGC ATT 

Cys Gl^Lys Lys Cys Asn Jl^Phe Arg Cys Pro Asn Gly Thr Cys SI 

P^ C o CC ACC ^ ^C TGT GAC GGC CTG CAC GAT TGC^TCG GAC GGC TCC 
Proper Thr Lys His Cys^Asp Gly Leu His Asp gs Ser Sp ler 

GAC GAG CAG CAC TGC GAG CCC CTG TGT ACA CGG TTC ATG GAC TTC GTG
Asp Glu Gln His Cys Glu Pro Leu Cys Thr Arg Phe Met Asp Phe Val
1270 1275 1280 



3924 
3972 
4020 
TGT AAG AAC CGC CAG CAG TGC CTC TTC CAC TCC ATG CTG TGC GAT GGG 4068
Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp Gly

1285 1290 1295 

ATC ATC CAG TGC CGT GAC GGC TCC GAC GAG GAC CCA GCC TTT GCA GGA 4116
Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Pro Ala Phe Ala Gly

1300 1305 1310 

TGC TCC CGA GAC CCC GAG TTC CAC AAG GTG TGC GAT GAG TTC GGC TTC 4164 
Cys Ser Arg Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly Phe 

1315 1320 1325 

CAG TGT CAG AAC GGC GTG TGC ATC AGC TTG ATC TGG AAG TGC GAC GGG 4212
Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp Gly
1330 1335 1340 1345

ATG GAT GAC TGC GGG GAC TAC TCC GAC GAG GCC AAC TGT GAA AAC CCC 4260
Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn Pro
1350 1355 1360

ACA GAA GCC CCC AAC TGC TCC CGC TAC TTC CAG TTC CGG TGT GAC AAT 4308
Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Asp Asn
1365 1370 1375

GGC CAC TGC ATC CCC AAC AGG TGG AAG TGT GAC AGG GAG AAT GAC TGT 4356
Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp Cys
1380 1385 1390

GGG GAC TGG TCC GAC GAG AAG GAC TGT GGA GAT TCA CAT GTA CTT CCG 4404
Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Val Leu Pro
1395 1400 1405

TCT ACG ACT CCT GCA CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC TGC 4452
Ser Thr Thr Pro Ala Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg Cys
1410 1415 1420 1425

GGC GGG GGG GCC TGC GTG ATA GAC ACG TGG GTT TGT GAC GGG TAC CGA 4500
Gly Gly Gly Ala Cys Val Ile Asp Thr Trp Val Cys Asp Gly Tyr Arg
1430 1435 1440

GAT TGC GCA GAT GGA TCC GAC GAG GAA GCC TGC CCC TCG CTC CCC AAT 4548
Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Ser Leu Pro Asn
1445 1450 1455

GTC ACT GCC ACC TCC TCC CCC TCC CAG CCT GGA CGA TGC GAC CGA TTT 4596
Val Thr Ala Thr Ser Ser Pro Ser Gln Pro Gly Arg Cys Asp Arg Phe
1460 1465 1470

GAG TTT GAG TGC CAC CAG CCA AAG AAG TGC ATC CCT AAC TGG AGA CGC 4644
Glu Phe Glu Cys His Gln Pro Lys Lys Cys Ile Pro Asn Trp Arg Arg
1475 1480 1485

TGT GAC GGC CAT CAG GAT TGC CAG GAT GGC CAG GAC GAG GCC AAC TGC 4692
Cys Asp Gly His Gln Asp Cys Gln Asp Gly Gln Asp Glu Ala Asn Cys
1490 1495 1500 1505

CCC ACT CAC AGC ACC TTG ACC TGC ATG AGC TGG GAG TTC AAG TGT GAG 4740
Pro Thr His Ser Thr Leu Thr Cys Met Ser Trp Glu Phe Lys Cys Glu
1510 1515 1520

GAT GGC GAG GCC TGC ATC GTG CTG TCA GAA CGC TGC GAC GGC TTC CTG 4788
Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly Phe Leu
1525 1530 1535

GAC TGC TCA GAT GAG AGC GAC GAG AAG GCC TGC AGT GAT GAG TTA ACT 4836
Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu Leu Thr
1540 1545 1550

GTA TAC AAA GTA CAG AAT CTT CAG TGG ACA GCT GAC TTC TCT GGG AAT 4884
Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser Gly Asn
1555 1560 1565

GTC ACT TTG ACC TGG ATG CGG CCC AAA AAA ATG CCC TCT GCT GCT TGT 4932
Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala Ala Cys
1570 1575 1580 1585

GTA TAC AAC GTG TAC TAT AGA GTT GTT GGA GAG AGC ATA TGG AAG ACT 4980
Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp Lys Thr
1590 1595 1600

CTG GAG ACT CAC AGC AAT AAG ACA AAC ACT GTA TTA AAA GTG TTG AAA 5028
Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val Leu Lys
1605 1610 1615

CCA GAT ACC ACC TAC CAG GTT AAA GTG CAG GTT CAG TGC CTG AGC AAG 5076
Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu Ser Lys

SS E2 S i£ 2S S SS S 2 S S £ « 5.2Z 
S 2S S £ SK EE S S £ E & SI 



5124 
5172 
5220 
5268 
5316 
5364 
5412 

S2 S£ S£ «5 ? AC ^ V OAT GAA AAT TCc" 



™ 1 ? 40 1745 

_ TCC 

Ile As P Asn Tyr Asp Glu Asn Ser 
1750 1755 



Ala He Pro Pro Pro Asn lie hIs n e £o Stt ™ ™ 5460 

5508 
5556 



„ ss ss k S3 §™ s s s i°„- s His sj i 5 

S £ SK £S S SS £ ESS Sg S S 

ACT TCA GAA AGG GCT GCT AGT AAC TTTACA GAA ATA AAG AA^TTG TTr 
Thr ser oiuArg Ala Ala Ser Asn_Phe Thr Glu lie %s ten leu ™ 

w T ? ^ C ~u C ?* G TAC ACC GTC AGA^GTG GCT GCG GTG ACG^AGT CGT GGG 

17? 5 hr LSU ^ Thr Val Val Ala Ala Va l Thr sSr Sg ^ 

x/iD 1720 1725 

tT A ^ AAC 2°° AGC GAT TCC TCC AT T ACC ACC GTG AAA GGA AAA 

Ile^Gly Asn Trp Ser Asp^Ser Lys Ser lie Thr Jhr Val S£ 

AAT 
Asn 

CTG AGT TTT ACC CTG ACC GTG GAT GGG AAC ATC AAG GTG AAT GGC TAT
Leu Ser Phe Thr Leu Thr Val Asp Gly Asn Ile Lys Val Asn Gly Tyr

1765 1770 1775

GTG GTG AAC CTT TTC TGG GCA TTT GAC ACC CAC AAA CAA B»r i«- 
Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Lys Lys

Thr Sit %Z ^ ^ AGC ™*<™ TCC CAC AAA GTT°GGC AAT CTG 

Thr Met Asn Phe Gln Gly Ser Ser Val Ser His Lys Val Gly Asn Leu

t?" ^ G A P °? C TAT GAG ATT TCC GCC TGG GCC 5 AAG ACT GAC TTG 
Thr Ala Gln Thr Ala Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp Leu

As* ^ T P° T ^ GAG ^ GTC A ' G ° ACC A °A GGG GTT 

Gly Asp Ser Pro Leu Ser Phe Glu His Val Thr Thr Arg Gly Val Arg

CCA CCT GCT CCT AGC CTC AAG GCC AGG GCT ATC AAT CAG ACT GCT GTG

Pro Pro Ala Pro Ser Leu Lys Ala Arg Ala Ile Asn Gln Thr Ala Val

GAA TGC ACC TGG ACA GGC CCC AGG AMOKS GTG TAT GGC ATT^TTC TAT 
073 I5|o TrP ^ ^fs^ Val Val ^ S Phe T?r 

SS Thr Ser P^e £ G J™ ^ CCA AGC AG ^CTG ACC ACG 

1875 P ifso ^ ASn Pr ° SSr SSr LeU Thr Thr 

CCG CTG CAC AAC GCA ACC GTG CTC GTC GGT AAG GAT^GAG CAG TAT nr 
PrO Q Leu Hxs Asn Ala ThrVM. Leu Val Gly Lys ^ S£ T^r Su 

p£ t CTG w TC CGG GTG GTG ATG CCC TAC CAA G^°CCG TCC TCG GAC TA^ 
Phe Leu Val Arg Val Val Met Pro Tyr Gln Gly Pro Ser Ser Asp Tyr

S3 vl? vJ? ^ A I° tT° EE? GAC AGC A ^CTT CCT CCC CGG CA^CTG 



Val tr-o ir -i ; tTT CCT CCC CGG CAC CTG 

Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His Leu

CAT GCC GTT CAC ACC GGC AAG ACC tS°gCC GTC ATC AAG TGG ^ GAG TCC 
Hxs Ala Val^His Thr Gly Lys jhr s#r Ma Val JJJ JJJ «J «J J» 

? AC TCT CCT GAC CAG GAC 5 CTG TTC TAT GCG ATC^GCA GTT AAA 
Pro Tyr Asp Ser Pro Asp Gin Asp Leu Phe Tyr Ala lie a£ VaT J£ 



5604 

5652 

5700 

5748 

5796 

5844 

5892 

5940 

5988 



SI Ala ^ Sff ^ G ?5 ^ ACC ~? ? CC - GT ? A T C *~ tIg 5 ^ TCO 603S 

6084 

^cu rne iyr Ala lie Ala Val Lys 
1955 1960 1965

GAT CTG ATA CGA AAG ACG GAC CGG AGC TAC AAA GTC AAG TCC CGC AAC 6132
Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg Asn
1970 1975 1980 1985

AGC ACC GTG GAG TAC ACC CTG AGC AAG CTG GAG CCC GGA GGG AAA TAC 6180
Ser Thr Val Glu Tyr Thr Leu Ser Lys Leu Glu Pro Gly Gly Lys Tyr
1990 1995 2000

CAC GTC ATT GTG CAG CTG GGG AAC ATG AGC AAA GAT GCC AGT GTG AAG 6228
His Val Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ala Ser Val Lys
2005 2010 2015

ATC ACC ACC GTT TCG TTA TCG GCA CCC GAT GCC TTA AAA ATC ATA ACA 6276
Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile Thr
2020 2025 2030

GAA AAT GAC CAC GTC CTT CTC TTC TGG AAA AGT CTA GCT CTA AAG GAA 6324
Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys Glu
2035 2040 2045

AAG TAT TTT AAC GAA AGC AGG GGC TAC GAG ATA CAC ATG TTT GAT AGC 6372
Lys Tyr Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp Ser
2050 2055 2060 2065

GCC ATG AAT ATC ACC GCA TAC CTT GGG AAT ACT ACT GAC AAT TTC TTT 6420
Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe Phe
2070 2075 2080

AAA ATT TCC AAC CTG AAG ATG GGT CAC AAT TAC ACA TTC ACG GTC CAG 6468
Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val Gln
2085 2090 2095

GCA CGA TGC CTT TTG GGC AGC CAG ATC TGC GGG GAG CCT GCC GTG CTA 6516
Ala Arg Cys Leu Leu Gly Ser Gln Ile Cys Gly Glu Pro Ala Val Leu
2100 2105 2110

CTG TAT GAT GAG CTG GGG TCT GGT GGC GAT GCG TCG GCG ATG CAG GCT 6564
Leu Tyr Asp Glu Leu Gly Ser Gly Gly Asp Ala Ser Ala Met Gln Ala
2115 2120 2125

GCC AGG TCT ACT GAT GTC GCC GCC GTG GTG GTG CCC ATC CTG TTT CTG 6612
Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe Leu
2130 2135 2140 2145

ATA CTG CTG AGC CTG GGG GTC GGG TTT GCC ATC CTG TAC ACG AAG CAT 6660
Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys His
2150 2155 2160

CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC AGC 6708
Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr Ser
2165 2170 2175

TCC AGA CTC GGC TCC GCC ATC TTC TCC TCT GGG GAT GAC TTG GGG GAG 6756
Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly Glu
2180 2185 2190

GAT GAT GAA GAT GCT CCT ATG ATC ACT GGA TTT TCG GAC GAC GTC CCC 6804
Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val Pro
2195 2200 2205

ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6856
Met Val Ile Ala
2210

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6916
GTTATTTTTA TATGGGCCAA AAACAAAAGC AAAAAAAAAA AAAAA 6961 
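
The pairing of codons with three-letter residues in SEQ ID NO: 3 follows the standard genetic code. As an illustration only, the minimal Python sketch below translates a stretch of coding sequence into three-letter residues and stops at the first stop codon. The names CODON_TABLE, THREE_LETTER and translate are hypothetical helpers, not part of the listing; the two input strings are copied from the first and last coding lines of SEQ ID NO: 3.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate a coding sequence into three-letter residues.

    Whitespace is ignored, trailing bases that do not fill a codon are
    dropped, and translation stops at the first stop codon.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First twelve codons (starting at the initiating ATG) and the final codons
# followed by the TGA stop, as printed in SEQ ID NO: 3.
start = translate("ATG GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC")
end = translate("ATG GTG ATA GCC TGA AAG AGC")
print(" ".join(start))
print(" ".join(end))
```

A sketch of this kind can also be used to spot positions where a printed codon and the residue printed beneath it disagree, which in a scanned listing usually points to a character-recognition error in one of the two lines.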



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
ATATCCACAT TGACAGCTAT GGTGAAAATT ATCTAAGCTT CACCCTGACC ATGGAGAGTG 60
ATATCAAGGT GAATGGCTAT GTGGTGAACC TTTTCTGGGC ATTTGACACC CACAAGCAAG 120
AGAGGAGAAC TTTGAACTTC CGAGGAAGCA TATTGTCACA CAAAGTTGGC AATCTGACAG 180
CTCATACATC CTATGAGATT TCTGCCTGGG CCAAGACTGA CTTGGGGGAT AGCCCTCTGG 240
CATTTGAGCA TGTTATGACC AGAGGGGTTC GCCCACCTGC ACCTAGCCTC AAGGCCAAAG 300

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6642 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATGGCGACAC GGAGCAGCAG GAGGGAGTCG CGACTCCCGT TCCTATTCAC CCTGGTCGCA 60
CTGCTGCCGC CCGGAGCTCT CTGCGAAGTC TGGACGCAGA GGCTGCACGG CGGCAGCGCG 120
CCCTTGCCCC AGGACCGGGG CTTCCTCGTG GTGCAGGGCG ACCCGCGCGA GCTGCGGCTG 180
TGGGCGCGCG GGGATGCCAG GGGGGCGAGC CGCGCGGACG AGAAGCCGCT CCGGAGGAAA 240
CGGAGCGCTG CCCTGCAGCC CGAGCCCATC AAGGTGTACG GACAGGTTAG TCTGAATGAT 300
TCCCACAATC AGATGGTGGT GCACTGGGCT GGAGAGAAAA GCAACGTGAT CGTGGCCTTG 360
GCCCGAGATA GCCTGGCATT GGCGAGGCCC AAGAGCAGTG ATGTGTACGT GTCTTACGAC 420
TATGGAAAAT CATTCAAGAA AATTTCAGAC AAGTTAAACT TTGGCTTGGG AAATAGGAGT 480
GAAGCTGTTA TCGCCCAGTT CTACCACAGC CCTGCGGACA ACAAGCGGTA CATCTTTGCA 540
GACGCTTATG CCCAGTACCT CTGGATCACG TTTGACTTCT GCAACACTCT TCAAGGCTTT 600
TCCATCCCAT TTCGGGCAGC TGATCTCCTC CTACACAGTA AGGCCTCCAA CCTTCTCTTG 660
GGCTTTGACA GGTCCCACCC CAACAAGCAG CTGTGGAAGT CAGATGACTT TGGCCAGACC 720
TGGATCATGA TTCAGGAACA TGTCAAGTCC TTTTCTTGGG GAATTGATCC CTATGACAAA 780
CCAAATACCA TCTACATTGA ACGACACGAA CCCTCTGGCT ACTCCACTGT CTTCCGAAGT 840
ACAGATTTCT TCCAGTCCCG GGAAAACCAG GAAGTGATCC TTGAGGAAGT GAGAGATTTT 900
CAGCTTCGGG ACAAGTACAT GTTTGCTACA AAGGTGGTGC ATCTCTTGGG CAGTGAACAG 960
CAGTCTTCTG TCCAGCTCTG GGTCTCCTTT GGCCGGAAGC CCATGAGAGC AGCCCAGTTT 1020
GTCACAAGAC ATCCTATTAA TGAATATTAC ATCGCAGATG CCTCCGAGGA CCAGGTGTTT 1080



GTGTGTGTCA GCCACAGTAA CAACCGCACC AATTTATACA TCTCAGAGGC AGAGGGGCTG 1140
AAGTTCTCCC TGTCCTTGGA GAACGTGCTC TATTACAGCC CAGGAGGGGC CGGCAGTGAC 1200
ACCTTGGTGA GGTATTTTGC AAATGAACCA TTTGCTGACT TCCACCGAGT GGAAGGATTG 1260
CAAGGAGTCT ACATTGCTAC TCTGATTAAT GGTTCTATGA ATGAGGAGAA CATGAGATCG 1320
GTCATCACCT TTGACAAAGG GGGAACCTGG GAGTTTCTTC AGGCTCCAGC CTTCACGGGA 1380
TATGGAGAGA AAATCAATTG TGAGCTTTCC CAGGGCTGTT CCCTTCATCT GGCTCAGCGC 1440
CTCAGTCAGC TCCTCAACCT CCAGCTCCGG AGAATGCCCA TCCTGTCCAA GGAGTCGGCT 1500
CCAGGCCTCA TCATCGCCAC TGGCTCAGTG GGAAAGAACT TGGCTAGCAA GACAAACGTG 1560
TACATCTCTA GCAGTGCTGG AGCCAGGTGG CGAGAGGCAC TTCCTGGACC TCACTACTAC 1620
ACATGGGGAG ACCACGGCGG AATCATCACG GCCATTGCCC AGGGCATGGA AACCAACGAG 1680
CTAAAATACA GTACCAATGA AGGGGAGACC TGGAAAACAT TCATCTTCTC TGAGAAGCCA 1740
GTGTTTGTGT ATGGCCTCCT CACAGAACCT GGGGAGAAGA GCACTGTCTT CACCATCTTT 1800
GGCTCGAACA AAGAGAATGT CCACAGCTGG CTGATCCTCC AGGTCAATGC CACGGATGCC 1860
TTGGGAGTTC CCTGCACAGA GAATGACTAC AAGCTGTGGT CACCATCTGA TGAGCGGGGG 1920
AATGAGTGTT TGCTGGGACA CAAGACTGTT TTCAAACGGC GGACCCCCCA TGCCACATGC 1980
TTCAATGGAG AGGACTTTGA CAGGCCGGTG GTCGTGTCCA ACTGCTCCTG CACCCGGGAG 2040
GACTATGAGT GTGACTTCGG TTTCAAGATG AGTGAAGATT TGTCATTAGA GGTTTGTGTT 2100
CCAGATCCGG AATTTTCTGG AAAGTCATAC TCCCCTCCTG TGCCTTGCCC TGTGGGTTCT 2160
ACTTACAGGA GAACGAGAGG CTACCGGAAG ATTTCTGGGG ACACTTGTAG CGGAGGAGAT 2220
GTTGAAGCGC GACTGGAAGG AGAGCTGGTC CCCTGTCCCC TGGCAGAAGA GAACGAGTTC 2280
ATTCTGTATG CTGTGAGGAA ATCCATCTAC CGCTATGACC TGGCCTCGGG AGCCACCGAG 2340
CAGTTGCCTC TCACCGGGCT ACGGGCAGCA GTGGCCCTGG ACTTTGACTA TGAGCACAAC 2400
TGTTTGTATT GGTCCGACCT GGCCTTGGAC GTCATCCAGC GCCTCTGTTT GAATGGAAGC 2460
ACAGGGCAAG AGGTGATCAT CAATTCTGGC CTGGAGACAG TAGAAGCTTT GGCTTTTGAA 2520
CCCCTCAGCC AGCTGCTTTA CTGGGTAGAT GCAGGCTTCA AAAAGATTGA GGTAGCTAAT 2580
CCAGATGGCG ACTTCCGACT CACAATCGTC AATTCCTCTG TGCTTGATCG TCCCAGGGCT 2640
CTGGTCCTCG TGCCCCAAGA GGGGGTGATG TTCTGGACAG ACTGGGGAGA CCTGAAGCCT 2700
GGGATTTATC GGAGCAATAT GGATGGTTCT GCTGCCTATC ACCTGGTGTC TGAGGATGTG 2760
AAGTGGCCCA ATGGCATCTC TGTGGACGAC CAGTGGATTT ACTGGACGGA TGCCTACCTG 2820
GAGTGCATAG AGCGGATCAC GTTCAGTGGC CAGCAGCGCT CTGTCATTCT GGACAACCTC 2880
CCGCACCCCT ATGCCATTGC TGTCTTTAAG AATGAAATCT ACTGGGATGA CTGGTCACAG 2940
CTCAGCATAT TCCGAGCTTC CAAATACAGT GGGTCCCAGA TGGAGATTCT GGCAAACCAG 3000
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CTCACGGGGC TCATGGACAT GAAGATTTTC TACAAf 5GGGA AGAACACTGG AAGCAATGCC 
TGTGTGCCCA GGCCATGCAG CCTGCTGTGC CTGCCCAAGG CCAACAACAG TAGAAGCTGC 
AGGTGTCCAG AGGATGTGTC CAGCAGTGTG CTTCCATCAG GGGACCTGAT GTGTGACTG C 
CCTCAGGGCT ATCAGCTCAA GAACAATACC TGTGTCAAAG AAGAGAACAC CTGTCTTCGC 
AACCAGTATC GCTGCAGCAA CGGGAACTGT ATCAACAGCA TTTGGTGGTG TGACTTTGAC 
AACGACTGTG GAGACATGAG CGATGAGAGA AACTGCCCTA CCACCATCTG TGACCTGGAC 
ACCCAGTTTC GTTGCCAGGA GTCTGGGACT TGTATCCCAC TGT CCTATAA ATGTGACCTT 
GAGGATGACT GTGGAGACAA CAGTGATGAA AGTCATTGTG AAATGCACCA GTGCCGGAGT 
GACGAGTACA ACTGCAGTTC CGGCATGTGC ATCCGCTCCT CCTGGGTATG TGACGGGGAC 
AACGACTGCA GGGACTGGTC TGATGAAGCC AACTGTACCG CCATCTATCA CACCTGTGAG 
GCCTCCAACT TCCAGTGCCG AAACGGGCAC TGCATCCCCC AGCGGTGGGC GTGTGACGGG 
20 GATACGGACT GCCAGGATGG TTCCGATGAG GATCCAGTCA ACTGTGAGAA GAAGTGCAAT 

GGATTCCGCT GCCCAAACGG CACTTGCATC CCATCCAGCA AACATTGTGA TGGTCTGCGT 
GATTGCTCTG ATGGCTCCGA TGAACAGCAC TGCGAGCCCC TCTGTACGCA CTTCATGGAC 
25 TTTGTGTGTA AGAACCGCCA GCAGTGCCTG TTCCACTCCA TGGTCTGTGA CGGAATCATC 

CAGTGCCGCG ACGGGTCCGA TGAGGATGCG GCGTTTGCAG GATGCTCCCA AGATCCTGAG 
TTCCACAAGG TATGTGATGA GTTCGGTTTC CAGTGTCAGA ATGGAGTGTG CATCAGTTTG 
3Q ATTTGGAAGT GCGACGGGAT GGATGATTGC GGCGATTATT CTGATGAAGC CAACTGCGAA 4080 

AACCCCACAG AAGCCCCAAA CTGCTCCCGC TACTTCCAGT TTCGGTGTGA GAATGGCCAC 4140 
TGCATCCCCA ACAGATGGAA ATGTGACAGG GAGAACGACT GTGGGGACTG GTCTGATGAG 4200 
35 AAGGATTGTG GAGATTCACA TATTCTTCCC TTCTCGACTC CTGGGCCCTC CACGTGTCTG 4260 

CCCAATTACT ACCGCTGCAG CAGTGGGACC TGCGTGATGG ACACCTGGGT GTGCGACGGG 4320 
TACCGAGATT GTGCAGATGG CTCTGACGAG GAAGCCTGCC CCTTGCTTGC AAACGTCACT 438 0 
GCTGCCTCCA CTCCCACCCA ACTTGGGCGA TGTGACCGAT TTGAGTTCGA ATGCCACCAA 444 0 

40 

CCGAAGACGT GTATTCCCAA CTGGAAGCGC TGTGACGGCC ACCAAGATTG CCAGGATGGC 4500 
CGGGACGAGG CCAATTGCCC CACACACAGC ACCTTGACTT GCATGAGCAG GGAGTTCCAG 
TGCGAGGACG GGGAGGCCTG CATTGTGCTC TCGGAGCGCT GCGACGGCTT CCTGGACTGC 
TCGGACGAGA GCGATGAAAA GGCCTGCAGT GATGAGTTGA CTGTGTACAA AGTACAGAAT 
CTTCAGTGGA CAGCTGACTT CTCTGGGGAT GTGACTTTGA CCTGGATGAG GCCCAAAAAA 
ATGCCCTCTG CATCTTGTGT ATATAATGTC TACTACAGGG TGGTTGGAGA GAGCATATGG 4 800 
AAGACTCTGG AGACCCACAG CAATAAGACA AACACTGTAT TAAAAGTCTT GAAACCAGAT 



3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 



4560 
4620 
4680 
4740 



45 



50 



4860 



ACCACGTATC AGGTTAAAGT ACAGGTTCAG TGTCTCAGCA AGGCACACAA CACCAATGAC 4 920 
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TTTGTGACCC 


TGAGGACCCC AGAGGGATTG 


r*r*fi. r* jv^vippp 
L. vJAvsA x bL 


PTfria a TATrT 

V^ X i V_ X 


v_v_A' X Vj 1 v_>V 


ft. ou 




CTCCCCAGGG AAGCAGAAGG TGTGATTGTA [illegible] [illegible] [illegible] 5040
GGCCTCATCC GTGAGTACAT TGTAGAATA[?] [illegible] [illegible] [illegible] 5100
CAGAGGGCTG CTAGTAACTT TACAGAAATC [illegible] [illegible] [illegible] 5160
GTCAGAGTGG CTGCGGTGAC TAGTCGTGGA [illegible] [illegible] [illegible] 5220
ACCACCATAA AAGGAAAAGT GATCCCACCA [illegible] [illegible] [illegible] 5280
AATTATCTAA GCTTCACCCT [illegible] [illegible] [illegible] [illegible] 5340
AACCTTTTCT GGGCATTTGA [illegible] [illegible] [illegible] [illegible] 5400
AGCATATTGT CACACAAAGT TGGCAATCTG ACAGC[?]CATA [illegible] [illegible] 5460
TGGGCCAAGA CTGACTTGGG GGATAGCCCT [illegible] [illegible] [illegible] 5520
GTTCGCCCAC CTGCACCTAG CCTCAAGGCC AAAGCCAT[?]A [illegible] [illegible] 5580
ACCTGGACCG GCCCCCGGAA TGTGGTTTAT [illegible] [illegible] [illegible] 5640
CTCTATCGCA ACCCGAAGAG CTTGACTACT [illegible] [illegible] [illegible] 5700
AAGGATGAGC AGTATTTGTT TCTGGTCCGT [illegible] [illegible] [illegible] 5760
GACTACGTTG TAGTGAAGAT GATCCCGGA[?] [illegible] [illegible] [illegible] 5820
GTTCATACGG GCAAAACCTC [illegible] [illegible] [illegible] [?]TCTCCTGAC 5880
CAGGACTTGT TGTATGCAAT TGCAGTCAAA [illegible] [illegible] [?]AGGAGCTAC 5940
AAAGTAAAAT CCCGTAACAG CACTGTGGAA [illegible] [illegible] GCCTGGCGGG 6000
AAATACCACA TCATTGTCCA [illegible] [illegible] [illegible] AAAAATTACC 6060
ACAGTTTCAT TATCAGCACC [illegible] [illegible] [illegible] TCATGTTCTT 6120
CTGTTTTGGA AAAGCCTGGC TTTAAAGGAA [illegible] [illegible] GGGCTATGAG 6180
ATACACATGT TTGATAGTGC CATGAATATC [illegible] [illegible] TACTGACAAT 6240
TTCTTTAAAA TTTCCAACCT [illegible] [illegible] CGTTCACCGT CCAAGCAAGA 6300
TGCCTTTTTG [illegible] [illegible] [illegible] TGCTGTACGA TGAGCTGGGG 6360
TCTGGTGCAG ATGCATCTGC AACGCAGGCT GCCAGATCTA CGGATGTTGC TGCTGTGGTG 6420
GTGCCCATCT TATTCCTGAT ACTGCTGAGC CTGGGGGTGG GGTTTGCCAT CCTGTACACG 6480
AAGCACCGGA GGCTGCAGAG CAGCTTCACC GCCTTCGCCA ACAGCCACTA CAGCTCCAGG 6540
CTGGGGTCCG CAATCTTCTC CTCTGGGGAT GACCTGGGGG AAGATGATGA AGATGCCCCT 6600
ATGATAACTG GATTTTCAGA TGACGTCCCC ATGGTGATAG CC 6642




(2) INFORMATION FOR SEQ ID NO: 6 











(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2214 amino acids 

(B) TYPE: amino acid 
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(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe
1 5 10 15

Thr Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr
20 25 30

Gln Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe
35 40 45

Leu Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly
50 55 60

Asp Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys
65 70 75 80

Arg Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val
85 90 95

Ser Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu
100 105 110

Lys Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala
115 120 125

Arg Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser
130 135 140

Phe Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser
145 150 155 160

Glu Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg
165 170 175

Tyr Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp
180 185 190

Phe Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp
195 200 205

Leu Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg
210 215 220

Ser His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr
225 230 235 240

Trp Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp
245 250 255

Pro Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser
260 265 270

Gly Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu
275 280 285

Asn Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp
290 295 300

Lys Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln
305 310 315 320

Gln Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg
325 330 335

Ala Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala
340 345 350

Asp Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn
355 360 365

Arg Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu
370 375 380

Ser Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp
385 390 395 400

Thr Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg
405 410 415

Val Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser
420 425 430

Met Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly
435 440 445

Thr Trp Glu Phe Leu Gln Ala Pro Ala Phe Thr Gly Tyr Gly Glu Lys
450 455 460

Ile Asn Cys Glu Leu Ser Gln Gly Cys Ser Leu His Leu Ala Gln Arg
465 470 475 480

Leu Ser Gln Leu Leu Asn Leu Gln Leu Arg Arg Met Pro Ile Leu Ser
485 490 495

Lys Glu Ser Ala Pro Gly Leu Ile Ile Ala Thr Gly Ser Val Gly Lys
500 505 510

Asn Leu Ala Ser Lys Thr Asn Val Tyr Ile Ser Ser Ser Ala Gly Ala
515 520 525

Arg Trp Arg Glu Ala Leu Pro Gly Pro His Tyr Tyr Thr Trp Gly Asp
530 535 540

His Gly Gly Ile Ile Thr Ala Ile Ala Gln Gly Met Glu Thr Asn Glu
545 550 555 560

Leu Lys Tyr Ser Thr Asn Glu Gly Glu Thr Trp Lys Thr Phe Ile Phe
565 570 575

Ser Glu Lys Pro Val Phe Val Tyr Gly Leu Leu Thr Glu Pro Gly Glu
580 585 590

Lys Ser Thr Val Phe Thr Ile Phe Gly Ser Asn Lys Glu Asn Val His
595 600 605

Ser Trp Leu Ile Leu Gln Val Asn Ala Thr Asp Ala Leu Gly Val Pro
610 615 620

Cys Thr Glu Asn Asp Tyr Lys Leu Trp Ser Pro Ser Asp Glu Arg Gly
625 630 635 640

Asn Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro
645 650 655

His Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val
660 665 670

Ser Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe
675 680 685

Lys Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu
690 695 700

Phe Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser
705 710 715 720

Thr Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys
725 730 735

Ser Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys
740 745 750

Pro Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser
755 760 765

Ile Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu
770 775 780

Thr Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn
785 790 795 800

Cys Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys
805 810 815

Leu Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu
820 825 830

Thr Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp
835 840 845

Val Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp
850 855 860

Phe Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala
865 870 875 880

Leu Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly
885 890 895

Asp Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala
900 905 910

Tyr His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val
915 920 925

Asp Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu
930 935 940

Arg Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu
945 950 955 960

Pro His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp
965 970 975

Asp Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser
980 985 990

Gln Met Glu Ile Leu Ala Asn Gln Leu Thr Gly Leu Met Asp Met Lys
995 1000 1005

Ile Phe Tyr Lys Gly Lys Asn Thr Gly Ser Asn Ala Cys Val Pro Arg
1010 1015 1020

Pro Cys Ser Leu Leu Cys Leu Pro Lys Ala Asn Asn Ser Arg Ser Cys
1025 1030 1035 1040

Arg Cys Pro Glu Asp Val Ser Ser Ser Val Leu Pro Ser Gly Asp Leu
1045 1050 1055

Met Cys Asp Cys Pro Gln Gly Tyr Gln Leu Lys Asn Asn Thr Cys Val
1060 1065 1070

Lys Glu Glu Asn Thr Cys Leu Arg Asn Gln Tyr Arg Cys Ser Asn Gly
1075 1080 1085

Asn Cys Ile Asn Ser Ile Trp Trp Cys Asp Phe Asp Asn Asp Cys Gly
1090 1095 1100

Asp Met Ser Asp Glu Arg Asn Cys Pro Thr Thr Ile Cys Asp Leu Asp
1105 1110 1115 1120

Thr Gln Phe Arg Cys Gln Glu Ser Gly Thr Cys Ile Pro Leu Ser Tyr
1125 1130 1135

Lys Cys Asp Leu Glu Asp Asp Cys Gly Asp Asn Ser Asp Glu Ser His
1140 1145 1150

Cys Glu Met His Gln Cys Arg Ser Asp Glu Tyr Asn Cys Ser Ser Gly
1155 1160 1165

Met Cys Ile Arg Ser Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Arg
1170 1175 1180

Asp Trp Ser Asp Glu Ala Asn Cys Thr Ala Ile Tyr His Thr Cys Glu
1185 1190 1195 1200

Ala Ser Asn Phe Gln Cys Arg Asn Gly His Cys Ile Pro Gln Arg Trp
1205 1210 1215

Ala Cys Asp Gly Asp Thr Asp Cys Gln Asp Gly Ser Asp Glu Asp Pro
1220 1225 1230

Val Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr
1235 1240 1245

Cys Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp
1250 1255 1260

Gly Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp
1265 1270 1275 1280

Phe Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys
1285 1290 1295

Asp Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe
1300 1305 1310

Ala Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe
1315 1320 1325

Gly Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys
1330 1335 1340

Asp Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu
1345 1350 1355 1360

Asn Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys
1365 1370 1375

Glu Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn
1380 1385 1390

Asp Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile
1395 1400 1405

Leu Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr
1410 1415 1420

Arg Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly
1425 1430 1435 1440

Tyr Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu
1445 1450 1455

Ala Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp
1460 1465 1470

Arg Phe Glu Phe Glu Cys His Gln Pro Lys Thr Cys Ile Pro Asn Trp
1475 1480 1485

Lys Arg Cys Asp Gly His Gln Asp Cys Gln Asp Gly Arg Asp Glu Ala
1490 1495 1500

Asn Cys Pro Thr His Ser Thr Leu Thr Cys Met Ser Arg Glu Phe Gln
1505 1510 1515 1520

Cys Glu Asp Gly Glu Ala Cys Ile Val Leu Ser Glu Arg Cys Asp Gly
1525 1530 1535

Phe Leu Asp Cys Ser Asp Glu Ser Asp Glu Lys Ala Cys Ser Asp Glu
1540 1545 1550

Leu Thr Val Tyr Lys Val Gln Asn Leu Gln Trp Thr Ala Asp Phe Ser
1555 1560 1565

Gly Asp Val Thr Leu Thr Trp Met Arg Pro Lys Lys Met Pro Ser Ala
1570 1575 1580

Ser Cys Val Tyr Asn Val Tyr Tyr Arg Val Val Gly Glu Ser Ile Trp
1585 1590 1595 1600

Lys Thr Leu Glu Thr His Ser Asn Lys Thr Asn Thr Val Leu Lys Val
1605 1610 1615

Leu Lys Pro Asp Thr Thr Tyr Gln Val Lys Val Gln Val Gln Cys Leu
1620 1625 1630

Ser Lys Ala His Asn Thr Asn Asp Phe Val Thr Leu Arg Thr Pro Glu
1635 1640 1645

Gly Leu Pro Asp Ala Pro Arg Asn Leu Gln Leu Ser Leu Pro Arg Glu
1650 1655 1660

Ala Glu Gly Val Ile Val Gly His Trp Ala Pro Pro Ile His Thr His
1665 1670 1675 1680

Gly Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys
1685 1690 1695

Met Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn
1700 1705 1710

Leu Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser
1715 1720 1725

Arg Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys
1730 1735 1740

Gly Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu
1745 1750 1755 1760

Asn Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn
1765 1770 1775

Gly Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu
1780 1785 1790

Arg Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly
1795 1800 1805

Asn Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr
1810 1815 1820

Asp Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly
1825 1830 1835 1840

Val Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr
1845 1850 1855

Ala Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile
1860 1865 1870

Phe Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu
1875 1880 1885

Thr Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln
1890 1895 1900

Tyr Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser
1905 1910 1915 1920

Asp Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg
1925 1930 1935

His Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp
1940 1945 1950

Glu Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala
1955 1960 1965

Val Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser
1970 1975 1980

Arg Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly
1985 1990 1995 2000

Lys Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser
2005 2010 2015

Ile Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile
2020 2025 2030

Ile Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu
2035 2040 2045

Lys Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe
2050 2055 2060

Asp Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn
2065 2070 2075 2080

Phe Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr
2085 2090 2095

Val Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala
2100 2105 2110

Ile Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr
2115 2120 2125

Gln Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu
2130 2135 2140

Phe Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr
2145 2150 2155 2160

Lys His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His
2165 2170 2175

Tyr Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu
2180 2185 2190

Gly Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp
2195 2200 2205

Val Pro Met Val Ile Ala
2210

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6843 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(ix) FEATURE:

(A) NAME/KEY: sig peptide

(B) LOCATION: 81..164

(C) IDENTIFICATION METHOD: S

(ix) FEATURE:

(A) NAME/KEY: mat peptide

(B) LOCATION: 165..6722

(C) IDENTIFICATION METHOD: S 
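
As a quick consistency check on the two feature locations above, the following sketch counts the codons in each range and compares the total with the 2214-residue length stated for SEQ ID NO: 6. The function and constant names are illustrative and are not part of the listing. The calculation assumes that both locations are 1-based, inclusive nucleotide positions and that residue numbering starts at the initiator Met.

```python
# Feature locations as given for SEQ ID NO: 7 (assumed 1-based, inclusive).
SIG_PEPTIDE = (81, 164)
MAT_PEPTIDE = (165, 6722)


def codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return length // 3


def last_nucleotide_of_residue(residue: int, cds_start: int = SIG_PEPTIDE[0]) -> int:
    """Return the position of the third base of the codon for a given residue.

    Residue 1 is assumed to be the Met encoded at cds_start.
    """
    return cds_start + 3 * residue - 1


sig_codons = codon_count(*SIG_PEPTIDE)
mat_codons = codon_count(*MAT_PEPTIDE)
total_residues = sig_codons + mat_codons

print(sig_codons, mat_codons, total_residues)
print(last_nucleotide_of_residue(689))
```

Under these assumptions the signal peptide spans 28 codons and the mature peptide 2186, which together give the 2214 residues of SEQ ID NO: 6. The second function reproduces the running nucleotide count printed at the end of each codon row; for example, a row ending at residue 689 ends at nucleotide 2147.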
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(xi) SEQUENCE DESCRIPTION: SEQ ZD NO: 7: 

CCG GCCCAGCGGC TCTCCTGGCC 23 
TCGCGCTGCA CATTCTCTCC TGGCGGCGGC GCCACCTGCA GTAGCGTTCG CCCGAACATG 83 

Met 
1 

GCG ACA CGG AGC AGC AGG AGG GAG TCG CGA CTC CCG TTC CTA TTC ACC 131
Ala Thr Arg Ser Ser Arg Arg Glu Ser Arg Leu Pro Phe Leu Phe Thr
5 10 15

CTG GTC GCA CTG CTG CCG CCC GGA GCT CTC TGC GAA GTC TGG ACG CAG 179
Leu Val Ala Leu Leu Pro Pro Gly Ala Leu Cys Glu Val Trp Thr Gln
20 25 30

AGG CTG CAC GGC GGC AGC GCG CCC TTG CCC CAG GAC CGG GGC TTC CTC 227
Arg Leu His Gly Gly Ser Ala Pro Leu Pro Gln Asp Arg Gly Phe Leu
35 40 45

GTG GTG CAG GGC GAC CCG CGC GAG CTG CGG CTG TGG GCG CGC GGG GAT 275
Val Val Gln Gly Asp Pro Arg Glu Leu Arg Leu Trp Ala Arg Gly Asp
50 55 60 65

GCC AGG GGG GCG AGC CGC GCG GAC GAG AAG CCG CTC CGG AGG AAA CGG 323
Ala Arg Gly Ala Ser Arg Ala Asp Glu Lys Pro Leu Arg Arg Lys Arg
70 75 80

AGC GCT GCC CTG CAG CCC GAG CCC ATC AAG GTG TAC GGA CAG GTT AGT 371
Ser Ala Ala Leu Gln Pro Glu Pro Ile Lys Val Tyr Gly Gln Val Ser
85 90 95

CTG AAT GAT TCC CAC AAT CAG ATG GTG GTG CAC TGG GCT GGA GAG AAA 419
Leu Asn Asp Ser His Asn Gln Met Val Val His Trp Ala Gly Glu Lys
100 105 110

AGC AAC GTG ATC GTG GCC TTG GCC CGA GAT AGC CTG GCA TTG GCG AGG 467
Ser Asn Val Ile Val Ala Leu Ala Arg Asp Ser Leu Ala Leu Ala Arg
115 120 125

[illegible] [illegible] AGC AGT GAT GTG TAC GTG TCT TAC GAC TAT GGA AAA TCA TTC 515
Pro Lys Ser Ser Asp Val Tyr Val Ser Tyr Asp Tyr Gly Lys Ser Phe
130 135 140 145

AA[?] AAA ATT TCA GAC AAG TTA AAC TTT GGC TTG GGA AAT AGG AGT GAA 563
Lys Lys Ile Ser Asp Lys Leu Asn Phe Gly Leu Gly Asn Arg Ser Glu
150 155 160

GCT GTT ATC GCC CAG TTC TAC CAC AGC CCT GCG GAC AAC AAG CGG TAC 611
Ala Val Ile Ala Gln Phe Tyr His Ser Pro Ala Asp Asn Lys Arg Tyr
165 170 175

AT[?] TTT GCA GAC GCT TAT GCC CAG TAC CTC TGG ATC ACG TTT GAC TTC 659
Ile Phe Ala Asp Ala Tyr Ala Gln Tyr Leu Trp Ile Thr Phe Asp Phe
180 185 190

TGC AAC ACT CTT CAA GGC TTT TCC ATC CCA TTT CGG GCA GCT GAT CTC 707
Cys Asn Thr Leu Gln Gly Phe Ser Ile Pro Phe Arg Ala Ala Asp Leu
195 200 205

[illegible] CTA CAC AGT AAG GCC TCC AAC CTT CTC TTG GGC TTT GAC AGG TCC 755
Leu Leu His Ser Lys Ala Ser Asn Leu Leu Leu Gly Phe Asp Arg Ser
210 215 220 225

CAC CCC AAC AAG CAG CTG TGG AAG TCA GAT GAC TTT GGC CAG ACC TGG 803
His Pro Asn Lys Gln Leu Trp Lys Ser Asp Asp Phe Gly Gln Thr Trp
230 235 240

ATC ATG ATT CAG GAA CAT GTC AAG TCC TTT TCT TGG GGA ATT GAT CCC 851
Ile Met Ile Gln Glu His Val Lys Ser Phe Ser Trp Gly Ile Asp Pro
245 250 255

TAT GAC AAA CCA AAT ACC ATC TAC ATT GAA CGA CAC GAA CCC TCT GGC 899
Tyr Asp Lys Pro Asn Thr Ile Tyr Ile Glu Arg His Glu Pro Ser Gly
260 265 270

TAC TCC ACT GTC TTC CGA AGT ACA GAT TTC TTC CAG TCC CGG GAA AAC 947
Tyr Ser Thr Val Phe Arg Ser Thr Asp Phe Phe Gln Ser Arg Glu Asn
275 280 285

CAG GAA GTG ATC CTT GAG GAA GTG AGA GAT TTT CAG CTT CGG GAC AAG 995
Gln Glu Val Ile Leu Glu Glu Val Arg Asp Phe Gln Leu Arg Asp Lys
290 295 300 305

TAC ATG TTT GCT ACA AAG GTG GTG CAT CTC TTG GGC [illegible] GAA CA[?] CAG
Tyr Met Phe Ala Thr Lys Val Val His Leu Leu Gly Ser Glu Gln Gln
310 315 320

TCT TCT GTC CAG CTC TGG GTC TCC TTT GGC CGG AAG CCC ATG AGA GCA
Ser Ser Val Gln Leu Trp Val Ser Phe Gly Arg Lys Pro Met Arg Ala
325 330 335

GCC CAG TTT GTC ACA AGA CAT CCT ATT AAT GAA TAT TAC ATC GCA GAT
Ala Gln Phe Val Thr Arg His Pro Ile Asn Glu Tyr Tyr Ile Ala Asp
340 345 350

GCC TCC GAG GAC CAG GTG TTT GTG TGT GTC AGC CAC AGT AAC AAC CGC
Ala Ser Glu Asp Gln Val Phe Val Cys Val Ser His Ser Asn Asn Arg
355 360 365

ACC AAT TTA TAC ATC TCA GAG GCA GAG GGG CTG AAG TTC TCC CTG TCC
Thr Asn Leu Tyr Ile Ser Glu Ala Glu Gly Leu Lys Phe Ser Leu Ser
370 375 380 385

TTG GAG AAC GTG CTC TAT TAC AGC CCA GGA GGG GCC GGC AGT GAC ACC
Leu Glu Asn Val Leu Tyr Tyr Ser Pro Gly Gly Ala Gly Ser Asp Thr
390 395 400

TTG GTG AGG TAT TTT GCA AAT GAA CCA TTT GCT GAC TTC CAC CGA GTG
Leu Val Arg Tyr Phe Ala Asn Glu Pro Phe Ala Asp Phe His Arg Val
405 410 415

GAA GGA TTG CAA GGA GTC TAC ATT GCT ACT CTG ATT AAT GGT TCT ATG
Glu Gly Leu Gln Gly Val Tyr Ile Ala Thr Leu Ile Asn Gly Ser Met
420 425 430

AAT GAG GAG AAC ATG AGA TCG GTC ATC ACC TTT GAC AAA GGG GGA ACC
Asn Glu Glu Asn Met Arg Ser Val Ile Thr Phe Asp Lys Gly Gly Thr

Glu Cys Leu Leu Gly His Lys Thr Val Phe Lys Arg Arg Thr Pro His

GCC ACA TGC TTC AAT GGA GAG GAC TTT GAC AGG CCG GTG GTC GTG TCC 2099
Ala Thr Cys Phe Asn Gly Glu Asp Phe Asp Arg Pro Val Val Val Ser
660 665 670

AAC TGC TCC TGC ACC CGG GAG GAC TAT GAG TGT GAC TTC GGT TTC AAG 2147
Asn Cys Ser Cys Thr Arg Glu Asp Tyr Glu Cys Asp Phe Gly Phe Lys
675 680 685

ATG AGT GAA GAT TTG TCA TTA GAG GTT TGT GTT CCA GAT CCG GAA TTT 2195
Met Ser Glu Asp Leu Ser Leu Glu Val Cys Val Pro Asp Pro Glu Phe
690 695 700 705

TCT GGA AAG TCA TAC TCC CCT CCT GTG CCT TGC CCT GTG GGT TCT ACT 2243
Ser Gly Lys Ser Tyr Ser Pro Pro Val Pro Cys Pro Val Gly Ser Thr
710 715 720

TAC AGG AGA ACG AGA GGC TAC CGG AAG ATT TCT GGG GAC ACT TGT AGC 2291
Tyr Arg Arg Thr Arg Gly Tyr Arg Lys Ile Ser Gly Asp Thr Cys Ser
725 730 735

GGA GGA GAT GTT GAA GCG CGA CTG GAA GGA GAG CTG GTC CCC TGT CCC 2339
Gly Gly Asp Val Glu Ala Arg Leu Glu Gly Glu Leu Val Pro Cys Pro
740 745 750

CTG GCA GAA GAG AAC GAG TTC ATT CTG TAT GCT GTG AGG AAA TCC ATC 2387
Leu Ala Glu Glu Asn Glu Phe Ile Leu Tyr Ala Val Arg Lys Ser Ile
755 760 765

TAC CGC TAT GAC CTG GCC TCG GGA GCC ACC GAG CAG TTG CCT CTC ACC 2435
Tyr Arg Tyr Asp Leu Ala Ser Gly Ala Thr Glu Gln Leu Pro Leu Thr
770 775 780 785

GGG CTA CGG GCA GCA GTG GCC CTG GAC TTT GAC TAT GAG CAC AAC TGT 2483
Gly Leu Arg Ala Ala Val Ala Leu Asp Phe Asp Tyr Glu His Asn Cys
790 795 800

TTG TAT TGG TCC GAC CTG GCC TTG GAC GTC ATC CAG CGC CTC TGT TTG 2531
Leu Tyr Trp Ser Asp Leu Ala Leu Asp Val Ile Gln Arg Leu Cys Leu
805 810 815

AAT GGA AGC ACA GGG CAA GAG GTG ATC ATC AAT TCT GGC CTG GAG ACA 2579
Asn Gly Ser Thr Gly Gln Glu Val Ile Ile Asn Ser Gly Leu Glu Thr
820 825 830

GTA GAA GCT TTG GCT TTT GAA CCC CTC AGC CAG CTG CTT TAC TGG GTA 2627
Val Glu Ala Leu Ala Phe Glu Pro Leu Ser Gln Leu Leu Tyr Trp Val
835 840 845

GAT GCA GGC TTC AAA AAG ATT GAG GTA GCT AAT CCA GAT GGC GAC TTC 2675
Asp Ala Gly Phe Lys Lys Ile Glu Val Ala Asn Pro Asp Gly Asp Phe
850 855 860 865

CGA CTC ACA ATC GTC AAT TCC TCT GTG CTT GAT CGT CCC AGG GCT CTG 2723
Arg Leu Thr Ile Val Asn Ser Ser Val Leu Asp Arg Pro Arg Ala Leu
870 875 880

GTC CTC GTG CCC CAA GAG GGG GTG ATG TTC TGG ACA GAC TGG GGA GAC 2771
Val Leu Val Pro Gln Glu Gly Val Met Phe Trp Thr Asp Trp Gly Asp
885 890 895

CTG AAG CCT GGG ATT TAT CGG AGC AAT ATG GAT GGT TCT GCT GCC TAT 2819
Leu Lys Pro Gly Ile Tyr Arg Ser Asn Met Asp Gly Ser Ala Ala Tyr
900 905 910

CAC CTG GTG TCT GAG GAT GTG AAG TGG CCC AAT GGC ATC TCT GTG GAC 2867
His Leu Val Ser Glu Asp Val Lys Trp Pro Asn Gly Ile Ser Val Asp
915 920 925

GAC CAG TGG ATT TAC TGG ACG GAT GCC TAC CTG GAG TGC ATA GAG CGG 2915
Asp Gln Trp Ile Tyr Trp Thr Asp Ala Tyr Leu Glu Cys Ile Glu Arg
930 935 940 945

ATC ACG TTC AGT GGC CAG CAG CGC TCT GTC ATT CTG GAC AAC CTC CCG 2963
Ile Thr Phe Ser Gly Gln Gln Arg Ser Val Ile Leu Asp Asn Leu Pro
950 955 960

CAC CCC TAT GCC ATT GCT GTC TTT AAG AAT GAA ATC TAC TGG GAT GAC 3011
His Pro Tyr Ala Ile Ala Val Phe Lys Asn Glu Ile Tyr Trp Asp Asp
965 970 975

TGG TCA CAG CTC AGC ATA TTC CGA GCT TCC AAA TAC AGT GGG TCC CAG 3059
Trp Ser Gln Leu Ser Ile Phe Arg Ala Ser Lys Tyr Ser Gly Ser Gln

[illegible] 3155

[illegible] 3203

TGT CCA GAG GAT GTG TCC AGC AGT GTG CTT CCA TCA GGG GAC CTG ATG 3251

[illegible] 3299

GAA GAG AAC ACC TGT CTT CGC AAC CAG TAT CGC TGC AGC AAC GGG AAC 3347

[illegible] 3443

[illegible] 3491

[illegible] 3539

[illegible] 3587

[illegible] 3635

[illegible] 3683

[illegible] 3731

[illegible] 3779
[illegible] Gln Asp Gly Ser Asp Glu Asp Pro Val
1220 1225 1230

AAC TGT GAG AAG AAG TGC AAT GGA TTC CGC TGC CCA AAC GGC ACT TGC 3827
Asn Cys Glu Lys Lys Cys Asn Gly Phe Arg Cys Pro Asn Gly Thr Cys
1235 1240 1245

ATC CCA TCC AGC AAA CAT TGT GAT GGT CTG CGT GAT TGC TCT GAT GGC 3875
Ile Pro Ser Ser Lys His Cys Asp Gly Leu Arg Asp Cys Ser Asp Gly
1250 1255 1260 1265

TCC GAT GAA CAG CAC TGC GAG CCC CTC TGT ACG CAC TTC ATG GAC TTT 3923
Ser Asp Glu Gln His Cys Glu Pro Leu Cys Thr His Phe Met Asp Phe
1270 1275 1280

GTG TGT AAG AAC CGC CAG CAG TGC CTG TTC CAC TCC ATG GTC TGT GAC 3971
Val Cys Lys Asn Arg Gln Gln Cys Leu Phe His Ser Met Val Cys Asp
1285 1290 1295

GGA ATC ATC CAG TGC CGC GAC GGG TCC GAT GAG GAT GCG GCG TTT GCA 4019
Gly Ile Ile Gln Cys Arg Asp Gly Ser Asp Glu Asp Ala Ala Phe Ala
1300 1305 1310

GGA TGC TCC CAA GAT CCT GAG TTC CAC AAG GTA TGT GAT GAG TTC GGT 4067
Gly Cys Ser Gln Asp Pro Glu Phe His Lys Val Cys Asp Glu Phe Gly
1315 1320 1325

TTC CAG TGT CAG AAT GGA GTG TGC ATC AGT TTG ATT TGG AAG TGC GAC
Phe Gln Cys Gln Asn Gly Val Cys Ile Ser Leu Ile Trp Lys Cys Asp
1330 1335 1340 1345

GGG ATG GAT GAT TGC GGC GAT TAT TCT GAT GAA GCC AAC TGC GAA AAC
Gly Met Asp Asp Cys Gly Asp Tyr Ser Asp Glu Ala Asn Cys Glu Asn
1350 1355 1360

CCC ACA GAA GCC CCA AAC TGC TCC CGC TAC TTC CAG TTT CGG TGT GAG
Pro Thr Glu Ala Pro Asn Cys Ser Arg Tyr Phe Gln Phe Arg Cys Glu
1365 1370 1375

AAT GGC CAC TGC ATC CCC AAC AGA TGG AAA TGT GAC AGG GAG AAC GAC
Asn Gly His Cys Ile Pro Asn Arg Trp Lys Cys Asp Arg Glu Asn Asp
1380 1385 1390

TGT GGG GAC TGG TCT GAT GAG AAG GAT TGT GGA GAT TCA CAT ATT CTT
Cys Gly Asp Trp Ser Asp Glu Lys Asp Cys Gly Asp Ser His Ile Leu
1395 1400 1405

CCC TTC TCG ACT CCT GGG CCC TCC ACG TGT CTG CCC AAT TAC TAC CGC
Pro Phe Ser Thr Pro Gly Pro Ser Thr Cys Leu Pro Asn Tyr Tyr Arg
1410 1415 1420 1425

TGC AGC AGT GGG ACC TGC GTG ATG GAC ACC TGG GTG TGC GAC GGG TAC
Cys Ser Ser Gly Thr Cys Val Met Asp Thr Trp Val Cys Asp Gly Tyr
1430 1435 1440

CGA GAT TGT GCA GAT GGC TCT GAC GAG GAA GCC TGC CCC TTG CTT GCA
Arg Asp Cys Ala Asp Gly Ser Asp Glu Glu Ala Cys Pro Leu Leu Ala
1445 1450 1455

AAC GTC ACT GCT GCC TCC ACT CCC ACC CAA CTT GGG CGA TGT GAC CGA
Asn Val Thr Ala Ala Ser Thr Pro Thr Gln Leu Gly Arg Cys Asp Arg
1460 1465 1470

TTT 


GAG 


TTC 


GAA 


TGC 


CAC 


CAA 


CCG 


AAG 


ACG 


TGT 


ATT 


CCC 


AAC 


TGG 


AAG 


Phe 


Glu 


Phe 


Glu 


Cys 


His 


Gin 


Pro 


Lys 


Thr 


Cys 


He 


Pro 


Asn Trp 


Lys 




1475 






1480 








1485 








CGC 


TGT 


GAC 


GGC 


CAC 


CAA 


GAT 


TGC 


CAG 


GAT 


GGC 


CGG 


GAC 


GAG 


GCC 


AAT 


Arg 


Cys 

) 


Asp 


Gly 


His 


Gin 


Asp 


Cys 


Gin 


Asp 


Gly 


Arg 


Asp 


Glu 


Ala 


Asn 


149C 






1495 








1500 








1505 


TGC 


CCC 


ACA 


CAC 


AGC 


ACC 


TTG 


ACT 


TGC 


ATG 


AGC 


AGG 


GAG 


TTC 


CAG 


TGC 


Cys 


Pro 


Thr 


His 


Ser 


Thr 


Leu 


Thr 


Cys 


Met 


Ser 


Arg 


Glu 


Phe 


Gin 


Cys 








1510 








1515 








1520 


GAG 


GAC 


GGG 


GAG 


GCC 


TGC 


ATT 


GTG 


CTC 


TCG 


GAG 


CGC 


TGC 


GAC 


GGC 


TTC 


Glu 


Asp 


Gly 


Glu 


Ala 


Cys 


He 


val 


Leu 


Ser 


Glu 


Arg 


Cys 


Asp Gly 


Phe 








1525 








1530 








1535 




CTG 


GAC 


TGC 


TCG 


GAC 


GAG 


AGC 


GAT 


GAA 


AAG 


GCC 


TGC 


AGT 


GAT 


GAG 


TTG 


Leu 


Asp 


Cys 


Ser 


Asp 


Glu 


Ser 


Asp 


Glu 


Lys 


Ala 


Cys 


Ser 


Asp 


Glu 


Leu 




1540 








1545 








1550 






ACT 


GTG 


TAC 


AAA 


GTA 


CAG 


AAT 


CTT 


CAG 


TGG 


ACA 


GCT 


GAC 


TTC 


TCT 


GGG 


Thr 


Val 


Tyr 


Lys 


Val 


Gin 


Asn 


Leu 


Gin 


Trp 


Thr 


Ala 


Asp 


Phe 


Ser 


Gly 




1555 








1560 








1565 








GAT 


GTG 


ACT 


TTG 


ACC 


TGG 


ATG 


AGG 


CCC 


AAA 


AAA 


ATG 


CCC 


TCT 


GCA 


TCT 


Asp 


Val 


Thr 


Leu 


Thr 


Trp 


Met 


Arg 


Pro 


Lys 


Lys 


Met 


Pro 


Ser 


Ala 


Ser 


1570 








1575 








1580 








1585 


TGT 


GTA 


TAT 


AAT 


GTC 


TAC 


TAC 


AGG 


GTG 


GTT 


GGA 


GAG 


AGC 


ATA 


TGG 


AAG 


Cys 


Val 


Tyr 


Asn 


Val 


Tyr 


Tyr Arg 


Val 


Val 


Gly 


Glu 


Ser 


He 


Trp 


Lys 






1590 








1595 








1600 


ACT 


CTG 


GAG 


ACC 


CAC 


AGC 


AAT 


AAG 


ACA 


AAC 


ACT 


GTA 


TTA 


AAA 


GTC 


TTG 


Thr 


Leu 


Glu 


Thr 


His 


Ser 


Asn 


Lys 


Thr 


Asn 


Thr 


Val 


Leu 


Lys 


Val 


Leu 








1605 








1610 








1615 




AAA 


CCA 


GAT 


ACC 


ACG 


TAT 


CAG 


GTT 


AAA 


GTA 


CAG 


GTT 


CAG 


TGT 


CTC 


AGC 


Lys 


Pro 


Asp 


Thr 


Thr 


Tyr 


Gin 


Val 


Lys 


Val 


Gin 


Val 


Gin 


Cys 


Leu 


Ser 




1620 






1625 








1630 






AAG 


GCA 


CAC 


AAC 


ACC 


AAT 


GAC 


TTT 


GTG 


ACC 


CTG 


AGG 


ACC 


CCA 


GAG 


GGA 


Lys 


Ala 


His 


Asn 


Thr 


Asn Asp 


Phe 


Val 


Thr 


Leu 


Arg 


Thr 


Pro 


Glu 


Gly 


1635 








1640 








1645 








TTG 


CCA 


GAT 


GCC 


CCT 


CGA 


AAT 


CTC 


CAG 


CTG 


TCA 


CTC 


CCC 


AGG 


GAA 


GCA 


Leu 


Pro 


Asp 


Ala 


Pro 


Arg 


Asn 


Leu 


Gin 


Leu 


Ser 


Leu 


Pro Arg 


Glu 


Ala 


1650 






1655 








1660 








1665 


GAA 


GGT 


GTG 


ATT 


GTA 


GGC 


CAC 


TGG 


GCT 


CCT 


CCC 


ATC 


CAC 


ACC 


CAT 


GGC 



4115 



4163 



4211 



4259 



4307 



4355 



4403 



4451 



4499 



4547 



4595 



4643 



4691 



4739 



4787 



4835 

40 TGT GTA TAT AAT GTC TAC TAC AGG GTG GTT GGA GAG AGC ATA TGG AAG 4 883 

4931 

45 AAA CCA RAT ACcTaCG TAT CAG GTT AAA GTA CAG GTT CAG TGT CTC AGC 4 979 

5027 

50 TTG CCA GAT GCC CCT CGA AAT CTC CAG CTG TCA CTC CCC AGG GAA GCA 5075 

5123 
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Glu Gly Val He Val Gly His Trp Ala Pre r>ro He His Thr His Glv 

1670 1675 16C0 

CTC ATC CGT GAG TAC ATT GTA GAA TAC AGC AGG AGT GGT TCC AAG ATG 5171
Leu Ile Arg Glu Tyr Ile Val Glu Tyr Ser Arg Ser Gly Ser Lys Met
1685 1690 1695

TGG GCC TCC CAG AGG GCT GCT AGT AAC TTT ACA GAA ATC AAG AAC TTA 5219
Trp Ala Ser Gln Arg Ala Ala Ser Asn Phe Thr Glu Ile Lys Asn Leu

1700 1705 1710

TTG GTC AAC ACT CTA TAC ACC GTC AGA GTG GCT GCG GTG ACT AGT CGT 5267
Leu Val Asn Thr Leu Tyr Thr Val Arg Val Ala Ala Val Thr Ser Arg
1715 1720 1725

GGA ATA GGA AAC TGG AGC GAT TCT AAA TCC ATT ACC ACC ATA AAA GGA 5315
Gly Ile Gly Asn Trp Ser Asp Ser Lys Ser Ile Thr Thr Ile Lys Gly
1730 1735 1740 1745

AAA GTG ATC CCA CCA CCA GAT ATC CAC ATT GAC AGC TAT GGT GAA AAT 5363
Lys Val Ile Pro Pro Pro Asp Ile His Ile Asp Ser Tyr Gly Glu Asn
1750 1755 1760

TAT CTA AGC TTC ACC CTG ACC ATG GAG AGT GAT ATC AAG GTG AAT GGC 5411
Tyr Leu Ser Phe Thr Leu Thr Met Glu Ser Asp Ile Lys Val Asn Gly

1765 1770 1775

TAT GTG GTG AAC CTT TTC TGG GGA TTT GAC ACC CAC AAG CAA GAG AGG 5459
Tyr Val Val Asn Leu Phe Trp Ala Phe Asp Thr His Lys Gln Glu Arg
1780 1785 1790

AGA ACT TTG AAC TTC CGA GGA AGC ATA TTG TCA CAC AAA GTT GGC AAT 5507
Arg Thr Leu Asn Phe Arg Gly Ser Ile Leu Ser His Lys Val Gly Asn

1795 1800 1805

CTG ACA GCT CAT ACA TCC TAT GAG ATT TCT GCC TGG GCC AAG ACT GAC 5555
Leu Thr Ala His Thr Ser Tyr Glu Ile Ser Ala Trp Ala Lys Thr Asp
1810 1815 1820 1825

TTG GGG GAT AGC CCT CTG GCA TTT GAG CAT GTT ATG ACC AGA GGG GTT 5603 
Leu Gly Asp Ser Pro Leu Ala Phe Glu His Val Met Thr Arg Gly Val 

1830 1835 1840 

CGC CCA CCT GCA CCT AGC CTC AAG GCC AAA GCC ATC AAC CAG ACT GCA 5651
Arg Pro Pro Ala Pro Ser Leu Lys Ala Lys Ala Ile Asn Gln Thr Ala

1845 1850 1855

GTG GAA TGT ACC TGG ACC GGC CCC CGG AAT GTG GTT TAT GGT ATT TTC 5699
Val Glu Cys Thr Trp Thr Gly Pro Arg Asn Val Val Tyr Gly Ile Phe

1860 1865 1870

TAT GCC ACG TCC TTT CTT GAC CTC TAT CGC AAC CCG AAG AGC TTG ACT 5747
Tyr Ala Thr Ser Phe Leu Asp Leu Tyr Arg Asn Pro Lys Ser Leu Thr
1875 1880 1885

ACT TCA CTC CAC AAC AAG ACG GTC ATT GTC AGT AAG GAT GAG CAG TAT 5795
Thr Ser Leu His Asn Lys Thr Val Ile Val Ser Lys Asp Glu Gln Tyr
1890 1895 1900 1905

TTG TTT CTG GTC CGT GTA GTG GTA CCC TAC CAG GGG CCA TCC TCT GAC 5843
Leu Phe Leu Val Arg Val Val Val Pro Tyr Gln Gly Pro Ser Ser Asp

1910 1915 1920

TAC GTT GTA GTG AAG ATG ATC CCG GAC AGC AGG CTT CCA CCC CGT CAC 5891
Tyr Val Val Val Lys Met Ile Pro Asp Ser Arg Leu Pro Pro Arg His

1925 1930 1935

CTG CAT GTG GTT CAT ACG GGC AAA ACC TCC GTG GTC ATC AAG TGG GAA 5939
Leu His Val Val His Thr Gly Lys Thr Ser Val Val Ile Lys Trp Glu

1940 1945 1950

TCA CCG TAT GAC TCT CCT GAC CAG GAC TTG TTG TAT GCA ATT GCA GTC 5987
Ser Pro Tyr Asp Ser Pro Asp Gln Asp Leu Leu Tyr Ala Ile Ala Val

1955 1960 1965

AAA GAT CTC ATA AGA AAG ACT GAC AGG AGC TAC AAA GTA AAA TCC CGT 6035
Lys Asp Leu Ile Arg Lys Thr Asp Arg Ser Tyr Lys Val Lys Ser Arg
1970 1975 1980 1985

AAC AGC ACT GTG GAA TAC ACC CTT AAC AAG TTG GAG CCT GGC GGG AAA 6083 
Asn Ser Thr Val Glu Tyr Thr Leu Asn Lys Leu Glu Pro Gly Gly Lys 

1990 1995 2000 

TAC CAC ATC ATT GTC CAA CTG GGG AAC ATG AGC AAA GAT TCC AGC ATA 6131
Tyr His Ile Ile Val Gln Leu Gly Asn Met Ser Lys Asp Ser Ser Ile
2005 2010 2015

AAA ATT ACC ACA GTT TCA TTA TCA GCA CC? GAT GCC TTA AAA ATC ATA 6179
Lys Ile Thr Thr Val Ser Leu Ser Ala Pro Asp Ala Leu Lys Ile Ile

2020 2025 2030

ACA GAA AAT GAT CAT GTT CTT CTG TTT TGG AAA AGC CTG GCT TTA AAG 6227 
Thr Glu Asn Asp His Val Leu Leu Phe Trp Lys Ser Leu Ala Leu Lys 

2035 2040 2045 

GAA AAG CAT TTT AAT GAA AGC AGG GGC TAT GAG ATA CAC ATG TTT GAT 6275
Glu Lys His Phe Asn Glu Ser Arg Gly Tyr Glu Ile His Met Phe Asp
2050 2055 2060 2065

AGT GCC ATG AAT ATC ACA GCT TAC CTT GGG AAT ACT ACT GAC AAT TTC 6323
Ser Ala Met Asn Ile Thr Ala Tyr Leu Gly Asn Thr Thr Asp Asn Phe

2070 2075 2080

TTT AAA ATT TCC AAC CTG AAG ATG GGT CAT AAT TAC ACG TTC ACC GTC 6371
Phe Lys Ile Ser Asn Leu Lys Met Gly His Asn Tyr Thr Phe Thr Val

2085 2090 2095

CAA GCA AGA TGC CTT TTT GGC AAC CAG ATC TGT GGG GAG CCT GCC ATC 6419
Gln Ala Arg Cys Leu Phe Gly Asn Gln Ile Cys Gly Glu Pro Ala Ile

2100 2105 2110

CTG CTG TAC GAT GAG CTG GGG TCT GGT GCA GAT GCA TCT GCA ACG CAG 6467
Leu Leu Tyr Asp Glu Leu Gly Ser Gly Ala Asp Ala Ser Ala Thr Gln

2115 2120 2125

GCT GCC AGA TCT ACG GAT GTT GCT GCT GTG GTG GTG CCC ATC TTA TTC 6515
Ala Ala Arg Ser Thr Asp Val Ala Ala Val Val Val Pro Ile Leu Phe
2130 2135 2140 2145

CTG ATA CTG CTG AGC CTG GGG GTG GGG TTT GCC ATC CTG TAC ACG AAG 6563
Leu Ile Leu Leu Ser Leu Gly Val Gly Phe Ala Ile Leu Tyr Thr Lys

2150 2155 2160

CAC CGG AGG CTG CAG AGC AGC TTC ACC GCC TTC GCC AAC AGC CAC TAC 6611
His Arg Arg Leu Gln Ser Ser Phe Thr Ala Phe Ala Asn Ser His Tyr

2165 2170 2175

AGC TCC AGG CTG GGG TCC GCA ATC TTC TCC TCT GGG GAT GAC CTG GGG 6659
Ser Ser Arg Leu Gly Ser Ala Ile Phe Ser Ser Gly Asp Asp Leu Gly

2180 2185 2190

GAA GAT GAT GAA GAT GCC CCT ATG ATA ACT GGA TTT TCA GAT GAC GTC 6707
Glu Asp Asp Glu Asp Ala Pro Met Ile Thr Gly Phe Ser Asp Asp Val

2195 2200 2205

CCC ATG GTG ATA GCC TGAAAGAGCT TTCCTCACTA GAAACCAAAT GGTGTAAATA 6762
Pro Met Val Ile Ala
2210

TTTTATTTGA TAAAGATAGT TGATGGTTTA TTTTAAAAGA TGCACTTTGA GTTGCAATAT 6822
GTTATTTTTA TATGGGCCAA A 6843



Claims 

1. DNA having a nucleotide sequence as shown by Sequence ID No. 1.

2. An LDL receptor analog protein having an amino acid sequence as shown by Sequence ID No. 2 and coded by 
DNA of Claim 1. 

3. DNA having a nucleotide sequence as shown by Sequence ID No. 5. 

4. An LDL receptor analog protein having an amino acid sequence as shown by Sequence ID No. 6 and coded by 
DNA of Claim 3. 

5. A recombinant vector comprising DNA as shown by Sequence ID No. 1 or 5 and a replicable vector. 

6. Transformant cells which harbor the recombinant vector of Claim 5. 
7. A method for the production of an LDL receptor analog protein comprising the steps of culturing the transformants
of Claim 6 and collecting a polypeptide produced in the culture.